From: Sergio R. Caprile
Subject: Re: [lwip-users] Device crashes while connected via TCP and Serial simultaneously
Date: Mon, 30 Jan 2017 10:54:53 -0300
User-agent: Mozilla/5.0 (Windows NT 6.1; rv:45.0) Gecko/20100101 Thunderbird/45.7.0
The micro you have is a Cortex-R, it does have an MPU.
Your system might be setting up protected spaces and that could trigger
exceptions. Those would more likely be at start, not sporadically in the
long run, but you need to be sure. You should know your init functions
and be able to tell if the MPU is on and if there are protected spaces.
You should be able to find out exactly who is causing that exception by
reading the faulting address from wherever your core records it. That would probably point you right away
to the culprit without much guessing and debugging. I can't help you
with Cortex-R; I'm an -M guy. The MPU or the fetch unit triggered an
exception because an instruction at some address (who) wanted to access
memory in some place (where) that is either non-existent (fetch
unit) or not allowed (MPU). Since you said it is a data fetch exception,
I bet it is the fetch unit and not the MPU, but it is your processor in
your hardware and you have to know its intricacies.
Once you have the address, you can check the map file the linker outputs
to find the function that is doing that. Then you can probably put a
breakpoint there and try to get why it is so.
Running an "I wrote it myself" application is not good enough,
particularly if you have never done it before. Known-to-work applications are, for
example, those in the contrib tree (1.4.1) or in the app tree (2.0.x).
Run one of those apps AND your serial stuff, remove all kinds of
communication between them, and when you are sure neither of them is
blowing up your system, then start having each write to the other's buffer.
If you could make the echo app run without a hassle, that is a good
sign, but not enough. I've seen that run on bad ports mixing contexts.
Netio would probably be the same, unless you run it as a master.
Again:
You must have:
NO_SYS=1
a main loop
some interrupt handlers.
You must call lwIP functions in only one context, either main or
interrupt, but not both. The code snippet you sent looks OK-enough to me
not to cause a blowup (but that proves nothing), but you should check
there is enough room before calling tcp_write().
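A minimal sketch of that room check, for illustration only. tcp_sndbuf(), tcp_write(), ERR_OK, ERR_MEM and TCP_WRITE_FLAG_COPY are real lwIP names, but the definitions in the marked section below are host-side stand-ins so the sketch compiles without lwIP; in a real build you would remove them and include "lwip/tcp.h". send_what_fits() is a hypothetical helper name, not something from your code or from lwIP.

```c
#include <stdint.h>

/* --- Host-side stand-ins so this sketch compiles without lwIP. ---
 * In a real build, delete this section and include "lwip/tcp.h";
 * lwIP then provides all of these names itself. */
typedef int8_t err_t;
#define ERR_OK   0
#define ERR_MEM (-1)
#define TCP_WRITE_FLAG_COPY 0x01
struct tcp_pcb { uint16_t snd_buf; };        /* fake pcb: free send-buffer bytes */
#define tcp_sndbuf(pcb) ((pcb)->snd_buf)
static err_t tcp_write(struct tcp_pcb *pcb, const void *data,
                       uint16_t len, uint8_t apiflags)
{
    (void)data; (void)apiflags;
    if (len > pcb->snd_buf) return ERR_MEM;  /* no room: refuse */
    pcb->snd_buf -= len;
    return ERR_OK;
}
/* --- End of stand-ins. --- */

/* Hypothetical helper: queue at most what fits in the send buffer.
 * Returns the number of bytes accepted; the caller keeps the rest
 * and retries later, for example from the sent callback. */
static uint16_t send_what_fits(struct tcp_pcb *pcb,
                               const uint8_t *data, uint16_t len)
{
    uint16_t room = tcp_sndbuf(pcb);
    uint16_t n = (len < room) ? len : room;

    if (n == 0) {
        return 0;                            /* nothing fits right now */
    }
    /* tcp_write() can still fail for other reasons, so check it too. */
    if (tcp_write(pcb, data, n, TCP_WRITE_FLAG_COPY) != ERR_OK) {
        return 0;
    }
    return n;
}
```

The point is that the return value tells the caller how much was really queued, so nothing is silently lost when the buffer is full.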
You are sending frames on the main context. What do you do when the
hardware signals there is an Ethernet frame ready? You must not call
lwIP from there; you need to queue it somewhere or keep it in the chip
and raise a flag or equivalent; then the main loop will deliver the
frame to lwIP. You cannot have frames delivered to lwIP on interrupts
and call lwIP to send frames on the main loop without wreaking havoc.
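The "queue it and let the main loop deliver it" idea can be sketched as a small single-producer, single-consumer ring. This is only an illustration: every name here (eth_isr_enqueue, main_loop_poll, demo_deliver, FRAME_MAX, QUEUE_SLOTS) is made up, the sizes are assumptions, and a real port may need memory barriers or a different copy strategy depending on the core and the Ethernet driver.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define FRAME_MAX   1518u  /* assumed maximum Ethernet frame size */
#define QUEUE_SLOTS 4u     /* assumed depth; must be a power of two */

struct frame { uint16_t len; uint8_t data[FRAME_MAX]; };

static struct frame queue[QUEUE_SLOTS];
static volatile uint32_t head;  /* written only by the interrupt handler */
static volatile uint32_t tail;  /* written only by the main loop */

/* Called from the Ethernet interrupt: copy the frame out, touch no lwIP. */
bool eth_isr_enqueue(const uint8_t *buf, uint16_t len)
{
    uint32_t h = head;
    uint32_t t = tail;

    if (len > FRAME_MAX || (h - t) == QUEUE_SLOTS) {
        return false;                    /* oversized or queue full: drop */
    }
    struct frame *f = &queue[h % QUEUE_SLOTS];
    memcpy(f->data, buf, len);
    f->len = len;
    head = h + 1;                        /* publish only after the data is in place */
    return true;
}

/* Called from the main loop: hand queued frames to the stack.
 * `deliver` is a placeholder for whatever feeds your netif input. */
unsigned main_loop_poll(void (*deliver)(const uint8_t *, uint16_t))
{
    unsigned n = 0;

    while (tail != head) {
        const struct frame *f = &queue[tail % QUEUE_SLOTS];
        deliver(f->data, f->len);
        tail = tail + 1;                 /* release the slot */
        n++;
    }
    return n;
}

/* Demo stand-in for the delivery function: just counts bytes. */
static unsigned demo_bytes;
static void demo_deliver(const uint8_t *data, uint16_t len)
{
    (void)data;
    demo_bytes += len;
}
```

The interrupt only ever writes head and the main loop only ever writes tail, so all lwIP calls stay in one context.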
Your raw_data_current structure might get corrupted by a wandering
pointer; you could add an assert on raw_data_current->buffer before
calling tcp_write(), or you could add breakpoints or print it out to
check it is where it should be. You can also enable
Please verify these and by all means learn your hardware and decode all
the information in the exception, because that will let you decide what
to do next.
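As an illustration of the assert on raw_data_current->buffer mentioned above, here is a sketch of a pointer range check. I have not seen your structure, so struct raw_data, pool, BUF_SIZE, in_pool() and check_raw_data() are all hypothetical; the idea is only that a buffer pointer is valid when it points inside the storage you actually allocated for it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define BUF_SIZE 256u  /* assumed size of each half of the double buffer */

/* The only storage a buffer pointer is ever allowed to reference. */
static uint8_t pool[2][BUF_SIZE];

/* Hypothetical layout; the real raw_data_current structure may differ. */
struct raw_data { uint8_t *buffer; size_t len; };

/* True when the range [p, p + len) lies entirely inside pool. */
static bool in_pool(const uint8_t *p, size_t len)
{
    uintptr_t lo = (uintptr_t)&pool[0][0];
    uintptr_t hi = lo + sizeof pool;
    uintptr_t a  = (uintptr_t)p;

    return a >= lo && a <= hi && len <= hi - a;
}

/* Call right before tcp_write(), or at the top of the serial handler. */
static void check_raw_data(const struct raw_data *r)
{
    assert(r != NULL);
    assert(in_pool(r->buffer, r->len));
}
```

If the assert fires, you know the pointer was already bad at that point, and you can move the check earlier until you corner whoever trashed it.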
And I read your first mail again... your serial handler has a big
problem and is trashing memory, your debugger is telling you that with
color signs and bells, you already found the culprit:
"I also notice that when the problem occurs the buffers (in the double
buffer) I use in the receiving interrupt routine of the serial interface
point to an address outside the allowed region"
Once you have verified you are using lwIP properly, fix that first.
How?
Well, put breakpoints, log the pointer addresses, simulate the code,
execute step by step, use the MPU, ask in the forum.
Why?
Maybe your code is OK but the pointer gets trashed by another function,
which causes the handler to trigger the exception. In such a case,
you'll have to find who is pestering your pointer. The most common cause
is an array out of bounds: trying to fit 20 elements in a 15-element array
and not having noticed... but your mileage may vary. One tactic I often
use is to disable all suspect functions and enable them one by one.
If you have a good debugger, you could trigger on accesses to the
pointer address, and check whether those are from your function or
"someone else", or you could program your MPU to detect that and trigger
an exception. Again, ask in the forum or "hire an engineer" ;)
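If you have neither a debugger with access triggers nor a configured MPU, a poor man's version of the same idea is to put known guard words around the suspect array and check them regularly. The sketch below is an assumption-laden illustration: serial_area, GUARD, RX_LEN, guards_intact() and fill_rx() are invented names, and fill_rx() is a deliberately unchecked writer that stands in for the "20 elements in a 15-element array" bug.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define GUARD  0xDEADBEEFu
#define RX_LEN 15u

/* The array sits between two known words; an overrun past either end
 * will most likely change one of them. */
struct serial_area {
    uint32_t guard_lo;
    uint8_t  rx[RX_LEN];
    uint32_t guard_hi;
};

static struct serial_area serial = { GUARD, {0}, GUARD };

/* Check this in the main loop, or before and after each suspect call. */
static bool guards_intact(void)
{
    return serial.guard_lo == GUARD && serial.guard_hi == GUARD;
}

/* Deliberately unchecked writer standing in for the buggy code.
 * It writes through a byte view of the whole struct, so the demo stays
 * inside one object; the caller must keep
 * count <= sizeof serial - offsetof(struct serial_area, rx). */
static void fill_rx(size_t count, uint8_t value)
{
    uint8_t *bytes = (uint8_t *)&serial;

    for (size_t i = 0; i < count; i++) {
        bytes[offsetof(struct serial_area, rx) + i] = value;
    }
}
```

Writing 15 bytes leaves both guards alone; writing 20 runs past rx and into guard_hi, so guards_intact() starts returning false. That tells you an overrun happened, and enabling your functions one by one then tells you who did it.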