Thanks Brojon.
If this conversation continues with you, I will tell you a story about how unexpected results from a computer system are not always detrimental. The proverbial ghost in the machine.
I am still trying to keep it within the context of people in this hobby, not IT developers, so it's a bit layman in its tone.
Software and hardware stability is a significant issue in the IT industry. You can see how Microsoft struggles with security issues and releases patches monthly to fix security holes (bugs) in the code.
You are 100% correct. There is no such thing as bug free code.
Every time an enhancement is released new bugs appear as customers push the limits of the data interaction.
Many years ago I constructed an AI-based computer control system for the metal finishing industry. It was an education in microprocessor-based issues when you are constrained by both performance and memory.
The EEPROM used in most microcontroller systems is very slow, as is the CPU; it can't be clocked very fast or it would generate too much heat. You would normally use EEPROM only as a bootloader and load the actual code into faster memory. Not much chance of this when you are dealing with a miniature FC.
Hardware and Basic OS Architecture.
There are only three basic architectures you can implement with microprocessor technology. I don't know what is inside a DJI FC.
You either:
- Poll the hardware constantly. Least effective.
- Use a hardware clock to generate a regular interrupt. Moderately effective.
- Use interrupt-driven hardware input. This is the most efficient and effective, often integrated with a real-time clock interrupt as well.
I also don't know if they actually have an operating system as such. I would assume so, as it would allow them to develop FC systems and invest in a code base.
Here are the potential problems with the Hardware.
Polling. This consumes 100% of the CPU constantly, which means there is no headroom for anything unexpected. It basically has to check every incoming data stream even if no data is present. If it spends too much time on one task due to complexity in the data stream, it will fail to recognise events within the poll cycle.
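To make the polling cost concrete, here's a toy sketch in C (hypothetical channel setup, nothing to do with DJI's actual firmware): every poll cycle pays the full cost of checking each input channel, whether or not any data has arrived.

```c
#include <assert.h>

/* Toy polling loop: each pass checks every channel, even idle ones. */
#define NUM_CHANNELS 4

int polls_performed;                 /* total channel checks, i.e. CPU spent */
int data_ready[NUM_CHANNELS];        /* 0 = nothing pending on that channel */

/* One poll cycle: returns how many channels actually had data. */
int poll_once(void) {
    int handled = 0;
    for (int ch = 0; ch < NUM_CHANNELS; ch++) {
        polls_performed++;           /* cost is paid regardless of data */
        if (data_ready[ch]) {
            data_ready[ch] = 0;      /* consume the data */
            handled++;
        }
    }
    return handled;
}
```

Even with nothing pending, `polls_performed` climbs every cycle; that is the CPU headroom being burned for no work done.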
Hardware clock. This is much more effective than polling, but it requires that the OS is smart enough not to miss any hardware events. It too is susceptible to a data stream hogging CPU cycles.
Hardware interrupt. A hierarchical hardware interrupt scheme is the most efficient: via the interrupt levels it can prioritise high-priority data streams above lower ones. An example would be prioritising flight stability, i.e. reading gyro/accelerometer data, over writing the log to the SD card. This is the most effective and efficient, but it is completely at the mercy of how fast the data streams generate interrupts, and that is dependent on the interfaces to the auxiliary sensors. Again, I have no idea how DJI integrate the hardware and sensor technology. I suspect a lot of it is embedded in the main LSI (Large Scale Integration) chip.
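A fixed-priority scheme can be sketched as always servicing the lowest-numbered (highest-priority) pending source first. This is a host-side simulation with made-up interrupt sources, not DJI's actual interrupt map:

```c
#include <assert.h>

/* Hypothetical interrupt sources; lower number = higher priority. */
enum { IRQ_GYRO = 0, IRQ_RC_LINK = 1, IRQ_SD_LOG = 2, IRQ_COUNT = 3 };

/* Return the highest-priority pending source, or -1 if none.
   'pending' is a bitmask with bit i set when source i has raised. */
int next_irq(unsigned pending) {
    for (int i = 0; i < IRQ_COUNT; i++)
        if (pending & (1u << i))
            return i;
    return -1;
}
```

Even if the SD-card logger raised its interrupt first, the gyro gets serviced ahead of it, which is exactly the flight-stability-before-logging behaviour described above.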
But in most cases the hardware is pretty reliable. You can have some nuisance sensor noise, as an example, that could affect flight stability, but a hardware failure in the FC will result in an irrecoverable fall from the sky.
Software and Development Tools.
This is where the problems really surface.
When I created the AI system in the mid 80s, I wrote the OS, the interpreters and the compilers in assembler (machine code). This gives you the best control of the hardware, but it is very slow and cumbersome writing complex code this way.
With the advent of 3GL and 4GL languages, your use of the language needs to be precise, and you also suffer from bugs in the compiler itself.
Hence I always recommended "don't do anything out of the ordinary" in your coding practice; don't try to be too clever. It can exacerbate any errors in the compiler itself, as you are covering new ground for the compiler. In most cases this fell on deaf ears when I did code reviews. It's like telling a fighter pilot to fly slow. Doh.
The majority of code written today is in C. There are zillions of C programmers, as it has long been used to educate software developers in computer science classes at universities.
It's one of the worst languages for practical application due to its intense issues with pointer management.
The other advice I always gave to student coders was to never forget that counting starts at 0, not 1.
Many bugs are written because of this misconception. When you are dealing with tabular data in a buffer, you deal with a base and an index (pointer). The first entry in the table (base plus index) is at index 0.
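In C terms: a table with N entries occupies indices 0 through N-1, so a bounds check must reject N itself. A minimal sketch:

```c
#include <stddef.h>
#include <assert.h>

/* A table of 'count' entries is addressed as base + index,
   and the valid indices are 0 .. count-1. */
int index_in_bounds(size_t index, size_t count) {
    return index < count;   /* note: '<', not '<=' -- count itself is out */
}
```

Writing `<=` here is the classic off-by-one: it lets the code read one entry past the end of the table.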
When I was working with Microsoft (JDP lead) in the late 90s on Windows 2000, they used a third-party tool to scan all the Windows NT code for poor pointer-management practices. Thousands of bugs were eliminated in this process.
Yes, the Windows OS is written in C.
The other biggest issue is lazy coding. I mentioned it in an earlier post: type checking and data type declaration. Weak type checking really exposes issues when new functionality is implemented, and sloppy data type declarations bite when unexpected data occurs, like the large data swings from excessive vibration.
Here are the potential problems with the Software / Firmware.
There are two finite resources in the FC: CPU cycles and memory.
The OS manages the memory resource by allocating it in pools/buffers. (If anyone is interested, I can explain how the simplest of these pools are managed; the structure is called a linked list.) The compiled code itself also uses memory pools, invisible to the programmer, embedded in the assembler output of the compiler. CPUs only run machine code. Well, some, like Intel's i3 to i7 chips, actually run microcode to stay compatible with earlier x86 and AMD x64 machine code.
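Since I mentioned linked lists, here is a minimal sketch of a fixed-block memory pool run as a singly linked free list. Real firmware allocators add alignment, locking and diagnostics; the sizes and names here are my own invention.

```c
#include <stddef.h>
#include <assert.h>

/* Minimal fixed-size block pool managed as a singly linked free list. */
#define POOL_BLOCKS 8
#define BLOCK_SIZE  32

typedef union block {
    union block *next;              /* valid only while the block is free */
    unsigned char data[BLOCK_SIZE]; /* user payload once allocated */
} block_t;

static block_t  pool[POOL_BLOCKS];
static block_t *free_list;

/* Chain every block onto the free list. */
void pool_init(void) {
    for (int i = 0; i < POOL_BLOCKS - 1; i++)
        pool[i].next = &pool[i + 1];
    pool[POOL_BLOCKS - 1].next = NULL;
    free_list = &pool[0];
}

/* Pop the head of the free list; NULL when the pool is exhausted. */
void *pool_alloc(void) {
    if (!free_list)
        return NULL;
    block_t *b = free_list;
    free_list = b->next;
    return b;
}

/* Push a block back onto the head of the free list. */
void pool_free(void *p) {
    block_t *b = (block_t *)p;
    b->next = free_list;
    free_list = b;
}
```

Both allocation and free are just a pointer swap at the head of the list, which is why this scheme suits a CPU- and memory-starved FC.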
So here are some logic errors that crop up constantly with Type Checking.
The logic of the written code might deal with parameters that run from 1 to 10. A lazy programmer will define options from 1 to 9, and if it's not any of them then it must be 10. So option 10 becomes a "catch-all".
What happens if the parameter 11 is passed, or any number above 10? It's treated like a 10. If this happens in an FC, imagine what calculations it might do based on the wrong data.
Examples of this occur when new functionality is introduced, like pano shots. You would have logic that determines what style of photo needs to be taken. If a logic section of code is missed when the feature is added, unpredictable results may occur when selecting pano.
Proper data checking would ensure that the input data is within the bounds of the logic routine handling it. If an out-of-bounds parameter is passed, you would then execute exception handling.
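The two styles side by side, as a sketch (using the 1-to-10 parameter range from the example above; the function names are mine):

```c
#include <assert.h>

/* Lazy "catch-all": anything that isn't 1..9 silently becomes mode 10. */
int lazy_mode(int param) {
    if (param >= 1 && param <= 9)
        return param;
    return 10;                       /* 11, 99, -5 all land here */
}

/* Defensive version: out-of-range input is rejected explicitly,
   so the caller can run its exception handling instead of flying
   on garbage data. */
int checked_mode(int param, int *mode) {
    if (param < 1 || param > 10)
        return -1;                   /* out of bounds: refuse it */
    *mode = param;
    return 0;
}
```

The lazy version never fails, which is exactly the problem: bad input becomes a plausible-looking mode and the error surfaces somewhere else entirely.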
So here are some logic errors that crop up constantly with Data Type declarations.
I don't want to get boring here, but when you declare something ("data") to the compiler, you have to be specific about the physical size of what you are declaring.
If you want to define a pointer, as an example, you could define it as a long or a short integer. The size is important: you use lots of data definitions that consume memory, and memory is finite, so you declare the minimal size required.
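Modern C makes the declared size explicit with the fixed-width types from `<stdint.h>`, rather than relying on what `short` or `long` happens to mean on a particular CPU. A sketch (the record layout is made up, purely for illustration):

```c
#include <stdint.h>
#include <assert.h>

/* Each field's storage cost is spelled out in its type name. */
typedef struct {
    uint8_t  flags;      /* 1 byte, unsigned */
    int16_t  offset;     /* 2 bytes, signed  */
    uint32_t timestamp;  /* 4 bytes, unsigned */
} record_t;
```

On a memory-starved FC, choosing `int16_t` over `int32_t` for a thousand-entry table halves its footprint, which is why the minimal-size habit matters.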
If you can imagine a "word", that's 16 bits of storage. 16 bits gives you a range of numbers from 0 to 65,535 if it's defined as an unsigned (positive-only) integer. It can also be a signed number from -32,768 to +32,767.
Of the 16 bits in storage, the most significant bit (the one furthest left) is the sign bit. If it's 0, the number is positive; if it's 1, it's negative. Can you imagine, if declared incorrectly, what would happen if the logic incremented this data word until it reached 32,768 or above?
Suddenly it would become negative, and the pointer you are using would no longer point to your memory pool but to somebody else's. You are about to stomp all over random memory. In a lot of cases this would/could try to write to protected ring 0 operating system memory, which would cause the CPU to trap. In the Windows world that is the Blue Screen of Death.
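The sign flip can be demonstrated on the bit pattern directly. This sketch does the arithmetic through `uint16_t` so the wraparound is well defined in C; the conversion back to signed is what every common two's-complement CPU does:

```c
#include <stdint.h>
#include <assert.h>

/* Increment a 16-bit signed offset. At +32,767 the carry spills into
   bit 15 (the sign bit) and the value flips hugely negative. */
int16_t next_offset(int16_t offset) {
    uint16_t raw = (uint16_t)offset;   /* work on the raw bit pattern */
    raw = (uint16_t)(raw + 1u);        /* unsigned wrap is well defined */
    return (int16_t)raw;               /* wraps on two's-complement CPUs */
}
```

A pointer offset that silently jumps from +32,767 to -32,768 is exactly the "stomp somebody else's memory" scenario above.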
So these are a couple of ways that poor coding can create a bug that only surfaces when something excessive happens in the input data stream.
The particular case in point, and the original statement I made about excessive vibration, goes like this.
The FC reacts to input from the Gyros and Accelerometers.
The FC will react to this data. Depending on the FC architecture, it will read the parameter data, perhaps every 20 milliseconds (example only). In most cases it will be the same as it was last time, so the FC will basically ignore it. When it does get a change in input data, it will process it and determine what it needs to do to the motors to stabilise the model.
Under normal flight conditions this may consume only a small amount of CPU cycles and memory allocation, i.e. it doesn't have to do this very often relative to the time it spends sampling input data.
Now imagine that you are getting excessive vibration from your model. When the data is sampled, the FC now needs to process every single sample: it has to calculate a reaction to the vibration pulse and program a change in motor speeds.
This then starts to consume more and more CPU cycles and more and more memory.
It starts to go down logic paths that are not normally used.
i.e. it is entering code paths that may never have been executed before. If the coding is perfect, not a problem. If not, you will get an FC trap and a fall from the sky.
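The sampling behaviour above can be sketched like this (hypothetical deadband and counters, not DJI's actual code): unchanged samples return almost immediately, while every changed sample forces the expensive control path, so sustained vibration drives the work counter up on every tick.

```c
#include <stdlib.h>
#include <assert.h>

#define DEADBAND 3      /* ignore jitter smaller than this (made-up units) */

static int last_sample;
static int work_done;   /* counts expensive control-law recalculations */

/* Called once per sample tick (e.g. every 20 ms in the example above). */
void handle_sample(int sample) {
    if (abs(sample - last_sample) <= DEADBAND)
        return;                 /* unchanged: cheap, the common case */
    last_sample = sample;
    work_done++;                /* changed: run the expensive control path */
}
```

In calm flight almost every call takes the cheap early return; feed it a vibrating signal where every sample differs and `work_done` climbs on every tick, which is the CPU and memory pressure described above.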
I hope this dissertation helps clarify what I was talking about.
Cheers Brian
p.s. I have not checked this in detail for spelling or grammar errors. I may have to edit. Sometimes autocorrect (autocorrupt) can change the meaning of a sentence.
Point out anything that is questionable.