Monday, August 20, 2018

C64Net WiFi Modem Filter State Machine

So, believe it or not, work continues on the C64Net WiFi Modem firmware, especially since the same firmware is also being used in a traditional RS-232 version of the modem based on the ESP32 module.




Some of the newer features include:
  1. New AT+CONFIG configuration menu
  2. Ability to set the hostname
  3. X-Modem and Z-Modem downloads
  4. AT+SHELL to access an SD-card interface.
  5. NTP client with configurable timezone.
  6. New configurable socket filtering state machine.
That last feature is what I wanted to document here.

The use case was the ability to have the modem filter or transform bytes coming either from a socket connection, or from a web page via the AT&G command.  There were already existing commands to mask out specific bytes, but users needed something more complex.  I searched the web as best I could for an existing language definition for this kind of thing, and came up empty.  I therefore chose to invent a really simple filtering code/language.

My requirements were that it had to be completely definable in ASCII, using only characters available for the AT command set in quotes.  It needed to be as compact as possible because of memory constraints, and needed to handle cases like filtering out everything inside HTML comments <!-- -->, or possibly filtering out everything NOT inside HTML comments.

Here is what I came up with:

State Machine entry format:
MMcCCNN
MM - byte value to match, in hex. The value 00 matches ALL.
c - Command character: e)at char, p)ush to queue, d)isplay char, r)eplace char, q)ueue display and empty, x) empty queue
CC - if c == 'r', then the hex value of the replacement byte
C - if c != 'r', then each C is another command character (same set as 'c'), or '-' to do nothing further.
NN - next state, in hex, starting with state 00.
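To make the field layout concrete, here is a rough Python sketch that decodes a single 7-character entry. The names `Entry` and `parse_entry` are made up for illustration and are not part of the firmware. It also assumes that a '-' stops command processing for that entry, and that 'r' only appears in the first command position (since it needs the two CC characters for its hex operand).

```python
from typing import NamedTuple, Optional


class Entry(NamedTuple):
    match: int                  # MM: byte to match; 0 matches any byte
    commands: str               # command characters to run, in order
    replacement: Optional[int]  # CC as a byte value, only when the command is 'r'
    next_state: int             # NN: state to go to after a match


def parse_entry(text: str) -> Entry:
    """Decode one MMcCCNN entry (hypothetical helper, not firmware code)."""
    if len(text) != 7:
        raise ValueError(f"{text!r} is not 7 characters long")
    match = int(text[0:2], 16)
    next_state = int(text[5:7], 16)
    cmd = text[2]
    if cmd == "r":
        # For 'r', the two CC characters are the replacement byte in hex.
        return Entry(match, "r", int(text[3:5], 16), next_state)
    # Otherwise each C is another command; assume '-' ends the list.
    commands = cmd + text[3:5].split("-")[0]
    for ch in commands:
        if ch not in "epdqx":
            raise ValueError(f"unknown command {ch!r} in {text!r}")
    return Entry(match, commands, None, next_state)


print(parse_entry("00qd-08"))
```

So an entry like 00qd-08 decodes to "match anything, run q then d, go to state 08".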

The machine starts in state 00 and, for each incoming byte, increments the state until an entry matches, at which point it executes that entry's commands and proceeds to state NN.
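The matching loop described above can be sketched in a few lines of Python. This is only an approximation of the behavior as described here, not the firmware's actual code: `run_filter` is a made-up name, and it assumes that a byte matching no entry is simply dropped, that 'r' displays the replacement byte, and that '-' ends an entry's command list.

```python
def run_filter(program: str, data: bytes, state: int = 0) -> bytes:
    """Run a string of MMcCCNN entries over data; return the displayed bytes."""
    if len(program) % 7 != 0:
        raise ValueError("program must be a whole number of 7-character entries")
    entries = [program[i:i + 7] for i in range(0, len(program), 7)]
    out = bytearray()    # bytes "displayed" to the user
    queue = bytearray()  # bytes held back by the 'p' command

    for byte in data:
        # Step forward from the current state until an entry matches.
        idx = next(
            (i for i in range(state, len(entries))
             if int(entries[i][0:2], 16) in (0, byte)),
            None,
        )
        if idx is None:
            continue  # assumption: no matching entry means the byte is dropped
        entry = entries[idx]
        if entry[2] == "r":
            # assumption: 'r' displays the replacement byte instead of the original
            out.append(int(entry[3:5], 16))
        else:
            for cmd in entry[2:5]:
                if cmd == "-":
                    break
                if cmd == "d":
                    out.append(byte)
                elif cmd == "p":
                    queue.append(byte)
                elif cmd == "q":
                    out += queue
                    queue.clear()
                elif cmd == "x":
                    queue.clear()
                # 'e' eats the byte, so there is nothing to do
        state = int(entry[5:7], 16)
    return bytes(out)


# Two entries: eat every '<', display everything else.
print(run_filter("3Ce--0000d--00", b"a<b"))  # prints b'ab'
```

Note that a "state" here is really just an index into the list of entries, which is why a catch-all 00 entry usually follows each specific match.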

Example:
Suppose you wanted to filter out everything in a web page EXCEPT the contents of the comments.

Important chars and their hex values:
< 3c
! 21
- 2d
> 3e

So, to grab only the stuff from <!-- -->, your state machine would look like this:
## -CODE--  COMMENT
00 3Ce--02  <-- if a '<' go to state 02
01 00e--00  <-- anything else, ignore it, go back to state 00
02 21e--04  <-- if '<!', go to state 04
03 00x--00  <-- anything else, ignore it, go back to state 00
04 2de--06  <-- if '<!-', go to state 06
05 00x--00  <-- anything else, ignore it, go back to state 00
06 2de--08  <-- if '<!--', go to state 08
07 00x--00  <-- anything else, ignore it, go back to state 00
08 2dp--0a  <-- now inside the <!--.  If '-', then state 0A
09 00qd-08  <-- anything else, display queue & char, go to state 08
0a 2dp--0c  <-- if '--', then queue the char, go to state 0C
0b 00qd-08  <-- anything else, display queue & char, go to state 08
0c 3ex--00  <-- if '-->', dump the queue, ignore char, go to state 00
0d 2dqd-0a  <-- if another '-', display queue & char, go to state 0A


So, to do the AT&Y command, we just combine the codes in order:
AT&Y"3Ce--0200e--0021e--0400x--002de--0600x--002de--0800x--002dp--0a00qd-082dp--0c00qd-083ex--002dqd-0a"
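If you would rather not paste the codes together by hand, a few lines of Python on the computer side can do it. This is just a convenience sketch: `build_aty` is a hypothetical helper, and the only check it makes is that every entry is exactly 7 characters long.

```python
def build_aty(entries):
    """Join 7-character MMcCCNN entries, in state order, into an AT&Y command."""
    for number, entry in enumerate(entries):
        if len(entry) != 7:
            raise ValueError(f"state {number:02x}: {entry!r} is not 7 characters")
    return 'AT&Y"' + "".join(entries) + '"'


# The comment-extracting state machine from the table above, one entry per state.
comment_filter = [
    "3Ce--02", "00e--00",  # states 00-01
    "21e--04", "00x--00",  # states 02-03
    "2de--06", "00x--00",  # states 04-05
    "2de--08", "00x--00",  # states 06-07
    "2dp--0a", "00qd-08",  # states 08-09
    "2dp--0c", "00qd-08",  # states 0a-0b
    "3ex--00", "2dqd-0a",  # states 0c-0d
]

print(build_aty(comment_filter))
```

Keeping the entries in a list also makes it easier to see which state number each one ends up with, since the NN fields refer to positions in that list.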

Then any subsequent packets received from an open socket, or from the AT&G command (which dumps a web page to the modem) will use the above filter.

A few extra utility arguments were added for convenience:
AT&Y     with no arguments clears the state machine definition entirely
AT&Yn   where n is a decimal number, sets the current state of the state machine.

All of this will be in version 3.4 of Zimodem.
