10.4.2 Networking Issues

JohnGrimshaw
Posts: 1233
Joined: Tue Oct 16, 2007 12:51 pm
Primary Venue / Use: Other
Where I Am: International Man of Mystery
Location: Sydney, Australia

Post by JohnGrimshaw » Thu Sep 18, 2008 8:33 am

Running 2x VL16 consoles (so I have to run V10.4.2). One is the primary console, located in the broadcast van for the TV shoot. It runs the DMX out, does all the "save show" business, and is under the control of the LD to run cues. The second is used as a remote slave, out in the studio area, so that moving lights can be placed with precision by the floor elecs and various cues/looks can be updated on the fly.

Every 20min or so, the network console freezes, and has to be crashed out and Remote Palette restarted (thankfully, not a full reboot). You can tell it freezes because the clock on the display stops. I also had a freeze where the encoder panel stopped working, but the mouse continued to function.


Fixed the issue of the consoles having the same Windows network name, so I don't get that annoying "duplicate name" dialogue anymore.

The network was running a DHCP system to allocate IP addresses. At the suggestion of Bill Richards (gotta love the timezone advantages!), I have now bypassed this and given the consoles fixed IP addresses. I did not have the time to test whether this fixed the problem, but do you guys know of any other reason why the network connection would drop every 20-30min?

BY THE WAY...
THIS is why a TRACKING console should be a fully operational REMOTE, and why two consoles in a tracking situation should be able to hand control to each other indefinitely. What I mean is: one console crashes and reboots, the other console takes control, and remains in control unless it crashes too, in which case the first console (now rebooted) takes back control. As far as the operator is concerned, they can still run the show from the main interface, even if it is the other console actually running things.

THIS is what would allow Strand to sell more consoles, because a fully operational console, used as a backup and as a plotting console is a real time saver in shows that run in extreme time pressures!!!!
...and for more entertainment industry trivia and useless facts, just ask:
John Grimshaw
Managing Director
Stage Fast Pty Ltd

MattKlasmeier
Posts: 491
Joined: Tue Oct 23, 2007 1:41 pm
Location: Cincinnati, OH

Re: 10.4.2 Networking Issues

Post by MattKlasmeier » Thu Sep 18, 2008 9:18 am

John,

From what you are describing, you want the consoles to track like the 500's did. The problem with full keystroke tracking (like the 500's) is that the command that crashed one console would be duplicated on the other and could take down the backup.

As for your network issue I would check the switch settings to ensure that the switch is not shutting down the connection. Broadcast storm control comes to mind. If you have the time I would try linking the consoles with a crossover cable to see if that solves the problem. That way you know if the issue is with the network or the consoles. Does the second console run stable by itself?

RobertBell
Posts: 2421
Joined: Fri Oct 12, 2007 1:11 pm
Primary Venue / Use: Other
Where I Am: Horizon Control Inc
Location: On the dark side just north of Toronto

Re: 10.4.2 Networking Issues

Post by RobertBell » Sat Sep 20, 2008 2:40 pm

Along Matt's thoughts... I wonder if the DHCP was expiring leases every 20 minutes. I'll be interested to know if the static IPs make the difference. It's also true that we should be more tolerant of bad connections.
Robert Bell - Product Manager - Horizon Control Inc.


Re: 10.4.2 Networking Issues

Post by JohnGrimshaw » Sat Sep 20, 2008 4:28 pm

Static IPs did not make a difference. I asked the guys to re-terminate the redundant Ethernet line as a crossover cable, and see if a direct console-to-console connection would eliminate the fault. I will know more on Monday.


Re: 10.4.2 Networking Issues

Post by JohnGrimshaw » Wed Sep 24, 2008 10:47 pm

Changed Hub, and tested for 70min. All working fine. The hub was the problem.

Also re-wired the redundant Ethernet cable as a crossover cable, and tested that. It works as well.

So...

Palette needs some "networking" work so that if a network dropout occurs, the whole system does not "freeze" up. It should automatically wait to reconnect, and periodically try to do this itself.
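
As an illustration of the "wait and periodically try to reconnect" behaviour asked for here, this is a minimal Python sketch. It is not how Palette works internally; the function name, the delays, and port 5000 in the usage note are all assumptions made for the example. The `connect` and `sleep` parameters exist only so the loop can be exercised without a real network.

```python
import socket
import time


def connect_with_retry(address, attempts=5, first_delay=0.5, max_delay=8.0,
                       connect=socket.create_connection, sleep=time.sleep):
    """Open a TCP connection, waiting a little longer after each failure.

    Hypothetical helper: the delay doubles after every failed attempt
    (capped at max_delay) so a dead link is not hammered with retries.
    """
    delay = first_delay
    for attempt in range(1, attempts + 1):
        try:
            # A short connect timeout keeps one attempt from hanging.
            return connect(address, timeout=2.0)
        except OSError:
            if attempt == attempts:
                raise  # out of attempts: let the caller decide what to do
            sleep(delay)
            delay = min(delay * 2, max_delay)
```

A remote console would call something like `connect_with_retry(("192.168.0.99", 5000))` from a background thread, so the user interface keeps running while the link is down.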

gooze
Posts: 1760
Joined: Tue Dec 18, 2007 12:42 pm
Location: Amsterdam, The Netherlands

Re: 10.4.2 Networking Issues

Post by gooze » Thu Sep 25, 2008 3:35 am

Was this an active hub that buffers the TCP/IP packets and thus changes the timing of it all?
Floriaan Ganzevoort - Lighting designer
THEATERMACHINE design. production. operations.

GaryDouglas
Posts: 689
Joined: Thu Oct 11, 2007 9:33 pm
Location: Calgary, Canada

Re: 10.4.2 Networking Issues

Post by GaryDouglas » Thu Sep 25, 2008 1:31 pm

The problem is the underlying stack usually has very long timeouts for TCP connections. Typically 30 seconds or more UNLESS the link goes away.

So, only if the glitch lasts 30 seconds or more do we find out about it (in the application) at all. These timeouts can sometimes stretch to infinite, depending on where the error lies. In this case, since the link was staying up, the system trusted that the data was being delivered by the switch, and was waiting for the reply (infinitely, I assume) which was never going to come.

The danger with monkeying with these timeouts (we CAN force them to be very short) is that if we're running over low bandwidth links (as some do in the case of remote over internet) the connections will be continually reset. This is not good either, depending on how the system is being used. There's really no way of determining an ideal value for all applications. Because of the wide range of media TCP/IP can be carried on, it is sometimes completely normal to wait a long time for replies, or to receive them in bursty, chopped up forms.
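
To make the trade-off concrete, here is a rough Python sketch of the two knobs involved: an application-level reply timeout and TCP keepalive probes. The helper name and every number in it are assumptions chosen for illustration, not values used by Palette; the keepalive tuning options are platform-specific, so the sketch only sets the ones the local Python build exposes.

```python
import socket


def apply_link_timeouts(sock, reply_timeout=5.0, idle=10, interval=5, probes=3):
    """Shorten how long a quiet or dead link can go unnoticed (hypothetical helper).

    reply_timeout: seconds a blocking send/recv may wait before raising
                   socket.timeout. Too short a value will keep resetting
                   slow links such as a remote session over the internet.
    idle/interval/probes: keepalive tuning, applied only where available.
    """
    sock.settimeout(reply_timeout)
    # Ask the stack to probe an idle connection instead of trusting it forever.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    tuning = (("TCP_KEEPIDLE", idle), ("TCP_KEEPINTVL", interval), ("TCP_KEEPCNT", probes))
    for name, value in tuning:
        option = getattr(socket, name, None)  # not defined on every platform
        if option is not None:
            sock.setsockopt(socket.IPPROTO_TCP, option, value)
    return sock
```

With these example values a silent peer would be noticed after roughly 10 + 3 x 5 = 25 seconds of idle time on platforms that honour all three options, which shows why no single setting suits both a studio LAN and a low-bandwidth remote link.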

The best course of action is to be very vigilant about the network. There are ways that users can protect themselves against failures like this by using IT grade devices with automated failover protections, redundant links, and so forth -- or simply a crossover cable to get rid of any "intelligence" causing issues in the middle.

The unfortunate issue is that these failures are seldom absolute. Many are simply degradations in performance that amplify the problem of timeouts and waiting, and may never make themselves obvious to the user. There's little we can do in the application space to combat this without causing other, more serious problems elsewhere.

If anyone has any ideas, I welcome suggestions...
Gary Douglas - Lead Software Developer - Pathway Connectivity - A Division of Acuity Brands Lighting Canada.


Re: 10.4.2 Networking Issues

Post by JohnGrimshaw » Thu Sep 25, 2008 5:00 pm

How about a simple "heartbeat" signal, sent by the master console? Lose the heartbeat, and the remote "re-asks" any unanswered questions, or "reconnects".

The obvious heartbeat could be embedded in the clock, which ticks away on the bottom right. We knew the system had hung the second the clock stopped.
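
A minimal Python sketch of the watchdog side of this idea follows. The class name and the 3-second timeout are assumptions for illustration; the remote would call `beat()` each time a heartbeat (or clock tick) arrives from the master, and check `is_alive()` on a timer to decide whether to re-ask or reconnect.

```python
import time


class HeartbeatMonitor:
    """Track when the master was last heard from (hypothetical sketch)."""

    def __init__(self, timeout=3.0, clock=time.monotonic):
        self.timeout = timeout    # seconds of silence tolerated
        self.clock = clock        # injectable so the logic can be tested
        self.last_beat = clock()  # treat start-up as the first beat

    def beat(self):
        """Record that a heartbeat has just arrived."""
        self.last_beat = self.clock()

    def is_alive(self):
        """True while the last heartbeat is no older than the timeout."""
        return self.clock() - self.last_beat <= self.timeout
```

Using a monotonic clock rather than wall-clock time means that adjusting the console's clock cannot trigger a false "master lost" event.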

TaineGilliam
Posts: 1184
Joined: Tue Oct 23, 2007 5:15 pm
Location: Cleveland, OH

Re: 10.4.2 Networking Issues

Post by TaineGilliam » Thu Sep 25, 2008 5:09 pm

Not that anyone is really considering this much restructuring, but for timeout issues I'd be looking at the threading model and the considerations for blocking vs. non-blocking network I/O. Unfortunately, I suspect these are choices dictated by the development tool library as much as anything else.
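
For readers unfamiliar with the distinction, here is a small Python sketch of the non-blocking style: instead of a `recv()` that can wait forever, the caller asks whether a reply is ready and carries on if it is not. The function names are made up for the example, and nothing here reflects how Palette is actually built.

```python
import select
import socket


def poll_reply(sock, wait=0.05, max_bytes=4096):
    """Return reply bytes if any arrive within `wait` seconds, else None.

    A blocking recv() would stall the caller (and anything sharing its
    thread, such as an on-screen clock) until data arrives. Note that an
    empty bytes result means the peer closed the connection.
    """
    readable, _, _ = select.select([sock], [], [], wait)
    if not readable:
        return None
    return sock.recv(max_bytes)


def demo():
    """Poll a local socket pair before and after the far end replies."""
    near, far = socket.socketpair()
    try:
        before = poll_reply(near, wait=0.01)  # nothing sent yet
        far.sendall(b"ack")
        after = poll_reply(near, wait=1.0)    # reply is now waiting
        return before, after
    finally:
        near.close()
        far.close()
```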

An alternate approach to network status monitoring would be a UDP heartbeat. After a certain period of time without a pulse, consider the device AWOL. I believe there was a mechanism like this in the days of SN1xx nodes and 300/500 consoles. I'd probably look at multicast instead of broadcast to facilitate structure on larger networks... Is there anywhere in the ACN stack of protocols that addresses this form of heartbeat?
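
A rough Python sketch of what a multicast UDP heartbeat could look like is given below. The group address, port, packet layout, and function names are all invented for the example; this is not the SN1xx/500 mechanism or anything taken from ACN.

```python
import socket
import struct

GROUP = "239.255.0.99"  # hypothetical administratively scoped multicast group
PORT = 50099            # hypothetical port
MAGIC = b"HB01"         # marks a datagram as one of our heartbeats
PACKET = struct.Struct("!4sIQ")  # magic, console id, sequence number


def encode_heartbeat(console_id, sequence):
    """Build the 16-byte heartbeat datagram."""
    return PACKET.pack(MAGIC, console_id, sequence)


def decode_heartbeat(data):
    """Return (console_id, sequence), or None if this is not a heartbeat."""
    if len(data) != PACKET.size:
        return None
    magic, console_id, sequence = PACKET.unpack(data)
    return (console_id, sequence) if magic == MAGIC else None


def open_sender(ttl=1):
    """UDP socket for sending; a TTL of 1 keeps packets on the local network."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_MULTICAST_TTL, ttl)
    return sock


def open_receiver(group=GROUP, port=PORT):
    """UDP socket that has joined the heartbeat group on the default interface."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind(("", port))
    membership = struct.pack("4s4s", socket.inet_aton(group), socket.inet_aton("0.0.0.0"))
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP, membership)
    return sock


def send_heartbeat(sock, console_id, sequence, group=GROUP, port=PORT):
    """Send one heartbeat; the sender never waits for an answer."""
    sock.sendto(encode_heartbeat(console_id, sequence), (group, port))
```

Because UDP is fire-and-forget, the sender is never stalled by a bad link, and a gap in the sequence numbers tells the receiver that packets are being dropped even before the pulse stops entirely.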

Taine


Re: 10.4.2 Networking Issues

Post by JohnGrimshaw » Fri Sep 26, 2008 6:57 pm

Here is what I would like... Show is running on console 1, and is designated the "main" console. Console 2 attaches and Console 1 sees a new networking dialogue, and some options for the MAIN console to select...
-------------------------------
Authorise Networking Console...
Console: 192.168.0.99
Control Type? View Only or Remote Control
Tracking Backup? Yes or No
Enable Remote Output? By User or Automatic or Enable DMX Now
-------------------------------
Control Type: Allows that console surface to control the show or not

Tracking Backup: Makes the attaching console get a copy of the show, so that it becomes the "main" console if the first console disappears. The Main Console can "optionally" retake control after reboot.

Enable Outputs: Answers these questions:
1. Is this console spitting out DMX out its hard ports right now? (good for a remote visualiser when running a show over a WAN)
2. What happens to lighting data output if a tracking console "takes control"?

With the above the "main" console can selectively activate and deactivate remote control on the fly. Each console knows the IP address of each other console in the "stack", with an order of priority as to who takes control. You could even put the RFU in there.
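
The proposal above could be modelled roughly as follows in Python. This is only a sketch of the wish list: the class names, the `priority` field, and the rule that the current holder keeps control while it is alive are assumptions drawn from the description, and 192.168.0.99 is the only address taken from the dialogue mock-up.

```python
from dataclasses import dataclass
from enum import Enum


class ControlType(Enum):
    VIEW_ONLY = "View Only"
    REMOTE_CONTROL = "Remote Control"


class RemoteOutput(Enum):
    BY_USER = "By User"
    AUTOMATIC = "Automatic"
    ENABLE_NOW = "Enable DMX Now"


@dataclass
class ConsoleEntry:
    """One console in the "stack" (hypothetical structure)."""
    address: str
    priority: int                        # lower number takes control first
    control: ControlType = ControlType.VIEW_ONLY
    tracking_backup: bool = False        # holds a show copy, may take over
    remote_output: RemoteOutput = RemoteOutput.BY_USER
    alive: bool = True                   # e.g. fed by a heartbeat monitor


def elect_main(stack, current=None):
    """Return the address that should be in control, or None if nobody can be.

    The console already in control keeps it while it is alive, so a
    rebooted main does not snatch control back automatically. Otherwise
    the live tracking console with the best priority takes over.
    """
    by_address = {console.address: console for console in stack}
    holder = by_address.get(current)
    if holder is not None and holder.alive:
        return current
    candidates = [c for c in stack if c.alive and c.tracking_backup]
    if not candidates:
        return None
    return min(candidates, key=lambda c: c.priority).address
```

Under this model the operator's "optionally retake control" choice is just a call that passes the rebooted main's address as `current` once it is alive again.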