What I want to emphasize is that I did all of this in less than a full week, and ended up with a working version. I haven't counted the code, but it's maybe a few thousand lines of Python, CoffeeScript, CSS and HTML.
Here comes the problem - I can never be this productive when writing code for the JUNOS data path, or for any embedded system I've worked on. Why is that?
So I began comparing my web application with a data path application, say NAT for embedded systems, to find the magic.
Yes, you can argue that embedded systems are much more complicated, but making a not-so-complicated thing complex is much easier than making it simple, right? If you think about a whole web system, you need to deal with a web server, DB server, message queue server, cache server, etc. If each of them exposed every detail to you and you had to learn all of them, who could write web applications so easily?
You may also argue that performance is everything, so we had to sacrifice almost everything else. But is that really the right way to do it, considering how much better hardware is than it was 15 years ago? Fifteen years ago, web applications were also written in C or something equivalent, for performance reasons I guess. But what about now? Our thinking should change with the era. Furthermore, is it easier to get the architecture right first and optimize afterwards? Or is it easier to optimize first, even if that leaves a lot of things in a mess, and then evolve the whole mess? The answer is obvious.
So here's my point - let's try to get the architecture right, with a framework that keeps developer usability in mind.
Here I'll try to come up with a crazy data plane framework. I don't even know if it works, but it is a good stress test for your brain when you're on a 12-hour flight with nothing else to do.
An engine is a very lightweight component, like a thread, but with a much smaller memory footprint and no data copy between engines. To boost performance, multiple instances of the same engine can run simultaneously.
An engine is the minimum unit in the forwarding path, following the open-closed principle. An engine should do one thing and only one thing, and it usually should not need to change when a new feature is introduced. For example, you should not create an L3 forward engine that combines lots of stuff. Instead, you should write something like a TTL engine, which just decrements the TTL of the given packet.
An engine has an inqueue and an outqueue, holding packets to be processed and packets to be sent to the next engine. The writer of the engine usually isn't aware of them. To move a packet forward to the next engine, the current engine just needs to call an API like this:
This API automatically works out the next engine to call and hands the packet to one of the not-so-busy instances of that engine.
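A minimal sketch of how such a call might behave, under my own assumptions: the names `Packet`, `EngineInstance` and `Scheduler`, and the signature `next_engine(pkt)`, are hypothetical. The path is simplified to an ordered list of engine names, and "not-so-busy" is approximated by the shortest inqueue.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Packet:
    payload: bytes
    path: list            # ordered engine names (hypothetical simplification)
    position: int = -1    # index of the engine that handled the packet last


@dataclass
class EngineInstance:
    name: str
    inqueue: deque = field(default_factory=deque)


class Scheduler:
    """Hypothetical dispatcher that hides the queues from engine writers."""

    def __init__(self):
        self.instances = {}   # engine name -> list of EngineInstance

    def register(self, instance):
        self.instances.setdefault(instance.name, []).append(instance)

    def next_engine(self, pkt):
        """Hand pkt to the least-loaded instance of the next engine on its path."""
        pkt.position += 1
        if pkt.position >= len(pkt.path):
            return None                      # end of path, nothing left to call
        candidates = self.instances[pkt.path[pkt.position]]
        target = min(candidates, key=lambda inst: len(inst.inqueue))
        target.inqueue.append(pkt)           # pass a reference, no data copy
        return target
```

The engine writer only ever calls `next_engine(pkt)`; which instance receives the packet is the scheduler's business.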
To write an engine, you basically need:
Command and implement methods like debug. Configuration is stored in an in-memory key-value database.
s2c(pkt). They will be called by process(pkt) based on the traffic direction.
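As a rough sketch of the shape this implies, here is a base class whose `process(pkt)` dispatches on traffic direction, plus the TTL engine mentioned earlier. Only `s2c` and `process` come from the text above; `c2s` (which I assume to be the client-to-server counterpart), `Engine`, `Direction` and the packet fields are names I made up for illustration.

```python
from dataclasses import dataclass
from enum import Enum


class Direction(Enum):
    C2S = "c2s"   # client to server (assumed counterpart of s2c)
    S2C = "s2c"   # server to client


@dataclass
class Packet:
    ttl: int
    direction: Direction


class Engine:
    """Base class: engine writers override c2s/s2c and never call process."""

    def process(self, pkt):
        # The framework calls process(); it picks the handler by direction.
        if pkt.direction is Direction.C2S:
            return self.c2s(pkt)
        return self.s2c(pkt)

    def c2s(self, pkt):
        return pkt   # default: pass the packet through unchanged

    def s2c(self, pkt):
        return pkt


class TTLEngine(Engine):
    """Does one thing only: decrement the TTL, in both directions."""

    def c2s(self, pkt):
        pkt.ttl -= 1
        return pkt

    s2c = c2s   # same behaviour for return traffic
```

An engine that only cares about one direction would override just one of the two handlers and inherit the pass-through default for the other.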
A path is the set of engines that a packet will go through. Usually the first packet triggers a session installation, which creates bidirectional paths for the session.
A path is organized as a bitmap.
The next_engine() API works with the path to decide the next engine, but the engine owner doesn't need to deal with the path.
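One way to picture a bitmap path, as a sketch only: assume every engine has a fixed bit position in a global order, so a path is an integer and finding the next engine means scanning for the next set bit. `ENGINE_ORDER`, `build_path` and the engine names in the list are hypothetical, and this `next_engine` takes the path explicitly, which the real API would hide.

```python
# Hypothetical global ordering: bit i of a path refers to ENGINE_ORDER[i].
ENGINE_ORDER = ["decap", "session_lookup", "nat", "ttl", "route", "encap"]


def build_path(engine_names):
    """Encode a set of engines as an integer bitmap."""
    bitmap = 0
    for name in engine_names:
        bitmap |= 1 << ENGINE_ORDER.index(name)
    return bitmap


def next_engine(path, current=None):
    """Return the name of the next engine set in path after current, or None."""
    start = 0 if current is None else ENGINE_ORDER.index(current) + 1
    for bit in range(start, len(ENGINE_ORDER)):
        if path & (1 << bit):
            return ENGINE_ORDER[bit]
    return None
```

With this encoding, adding a feature to a session is a single bit set on its path, and the order of execution is fixed by the global ordering rather than by the order in which engines were added.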
A session is almost the same concept as a typical firewall session. It's a 5-tuple based, bidirectional data structure that provides enough information for engines to process packets.
Sessions are stored in an in-memory key-value database. They can be queried and modified through the database API.
After session lookup, each packet data structure holds a copy of the matched session. Apart from invalidation, engines normally should not modify sessions in the database. Only the session lookup engine may modify a session - e.g. the sequence number, the statistics, etc.
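To make the lookup rule concrete, here is a small sketch in which a plain dict stands in for the in-memory key-value database. `FiveTuple`, `Session`, `SessionLookupEngine` and the `packets` counter are all hypothetical names; the point is only that both directions resolve to one session, the lookup engine is the sole writer, and later engines get a copy.

```python
import copy
from dataclasses import dataclass
from typing import NamedTuple, Optional


class FiveTuple(NamedTuple):
    src_ip: str
    dst_ip: str
    src_port: int
    dst_port: int
    proto: int

    def reversed(self):
        """The same flow as seen from the opposite direction."""
        return FiveTuple(self.dst_ip, self.src_ip,
                         self.dst_port, self.src_port, self.proto)


@dataclass
class Session:
    key: FiveTuple          # tuple of the packet that installed the session
    packets: int = 0        # example statistic


@dataclass
class Packet:
    tuple5: FiveTuple
    session: Optional[Session] = None


class SessionLookupEngine:
    """Only this engine writes to the session store (a dict stands in for the DB)."""

    def __init__(self):
        self.store = {}

    def install(self, key):
        session = Session(key)
        self.store[key] = session
        self.store[key.reversed()] = session   # both directions hit one session
        return session

    def process(self, pkt):
        session = self.store.get(pkt.tuple5)
        if session is None:
            session = self.install(pkt.tuple5)  # first packet installs the session
        session.packets += 1                    # statistics updated here only
        pkt.session = copy.copy(session)        # later engines see a private copy
        return pkt
```

Because downstream engines work on the copy attached to the packet, a careless write in one of them cannot corrupt the stored session.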
There are two session classes:
SessionStore has the static match method, if you inherit modify_on_match() should be implemented.
SessionStore will be separated based on protocols.
execute(), which will call the engine path attached to the direction the packet comes from.
decap(pkt). So you need to implement
There is much more to consider, for example TCP proxy, ALG, IDP, AI, QoS, etc. Unfortunately my flight is almost over and I need to release my brain for something more relaxing.
DPDK can zero-copy packets from the driver to user space. This is great news for this idea.
I struggled for a while over which language to choose. To me, Go is too young, Python/Ruby are too simple, and C/C++ would just be naive for this. Erlang, on the contrary, seems to be a smart choice.
I know little about Erlang. As you can see, my pseudocode so far is all C/Python style. I don't even know if Erlang supports OOP (I guess so). The reasons I think it is the right choice are:
So the next step is to learn Erlang and try to write a framework in which, just by adding an engine, I can make basic pass-through TCP traffic work without any issue.
Don't laugh at me if you're an expert. This is not an architecture spec. I just let my thoughts fly and recorded them faithfully.