The Official Unofficial Zorp project

Stacking is the process of combining proxy modules so that together they can handle complex protocols. One example is HTTPS, which is HTTP inside SSL. You can handle it by using the pssl proxy for the SSL part and then the http proxy for the HTTP part.

3.2 What do stacking under and stacking sideways mean?

If you stack proxy B under proxy A, the data stream is handled by proxy A, then by proxy B, and then by proxy A again. In this case proxy A has to know what to do with the stream before passing it to proxy B (how to modify it), and what to do after getting it back from proxy B.

For example, when you give an http proxy class as the value of the stack_proxy attribute of a pssl proxy class, the pssl proxy hands the stream to the http proxy after it has unpacked the SSL part, and applies SSL encoding to the stream again after getting it back (if both sides of pssl use SSL).

Stacking two proxies sideways means that the stream goes fully through proxy A and then through proxy B. In this case neither proxy needs to be aware that there is another proxy in the chain.

For example, you have an HTTPS connection and you want to connect to different servers with plain HTTP, based on the server name in the URL. You could do this with the http proxy if the traffic were not wrapped in SSL, but with http stacked under pssl it is not easy to control from the http proxy where pssl connects to. Instead, you can stack an http proxy after pssl in the following way:

client -> pssl -> http -> server

You do it with SideStackChainer, like this:

Service('blabla', MyHTTPS, router=InbandRouter(),
        chainer=SideStackChainer(ExtMyHTTP))

3.3 How much memory is eaten by Zorp proxies?

It is mostly proxy dependent. BalaBit uses the following parameters for sizing:

Zorp uses 8-10 MB in shared mode (binary, proxy, modules, libs); this does not depend on the number of instances.
Each instance uses a further 8-10 MB in non-shared mode, for several caches.
Each proxy thread uses memory for state and buffers. The memory used to store state is at most 16 kB, and the buffers are controllable. For plug the default is 2x1500 bytes if there is no stacking and 3x1500 bytes if there is. If you raise buffer_size, the buffer memory grows accordingly.
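
As a rough back-of-the-envelope check, the figures above can be combined into a single estimate. The following is only a sketch: the function name, its parameters and the decision to use the upper bounds (10 MB and 16 kB) are assumptions made for illustration, and real usage depends on the proxy.

```python
MB = 1024 * 1024
KB = 1024

def estimate_zorp_memory(instances, threads, stacking=False, buffer_size=1500):
    """Rough upper estimate in bytes, built from the sizing figures quoted above.

    threads is the total number of proxy threads across all instances.
    """
    shared = 10 * MB                                 # binary, proxy, modules, libs (counted once)
    per_instance = 10 * MB * instances               # non-shared caches of each instance
    buffers = (3 if stacking else 2) * buffer_size   # plug buffers: 2 without stacking, 3 with
    per_thread = (16 * KB + buffers) * threads       # state (at most 16 kB) plus buffers
    return shared + per_instance + per_thread

# Example: 4 instances serving 1000 plug threads in total, no stacking.
print(round(estimate_zorp_memory(instances=4, threads=1000) / MB, 1), "MB")
```

With these assumptions, 4 instances and 1000 plug threads come to roughly 68 MB, most of it the fixed per-instance cost rather than the per-thread state.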

3.4 I would like some more details about the performance of Zorp and connection setup rates.

I see the web-proxy throughput mentioned in the file Zorp2.pdf.

Those numbers are the results of lab testing, and as we all know (except for the marketing guys :) lab tests never really measure real life. We used "ab" (apachebench, bundled with Apache webservers) to generate HTTP requests through router/packet filter+NAT/Zorp to a custom web server (not really a webserver, just a program which understands HTTP and returns static content).

Our results clearly indicate that session startup time is much worse than for packet filters, but as soon as the proxies are running, throughput is quite good when the number of parallel sessions stabilizes (i.e. not many new/closing connections).

Speaking about real life, we are using Zorp in the following scenario:

- about 10000 users
- Four Pentium IV Xeon 2.4 GHz, 2 GB RAM, SCSI disks
- load balancing equipment to balance load across the four firewall boxes
- mail traffic is relayed (this results in lots of disk I/O)
- about 15 GB of logs each day

The system is stable at about 100 Mbit/s of Internet traffic (95% HTTP sessions), about 30000-40000 sessions/minute. It is important to note that Zorp supports HTTP keep-alive, therefore the number of connections is lower than the number of URLs fetched.

We tried to overload a single box just to see where the limits of a single-box configuration are. With a widespread e-mail virus active at the time, it could handle about 16000 connections per minute. I think that without the load generated by the mail system (postfix) we could achieve 18000-20000 connections per minute.

As we profiled and tuned the system for a couple of weeks I'm confident that about 90% of the load is caused by session startup/teardown.

We don't really have similar real-life performance numbers for SSL. Zorp uses OpenSSL, so it would be possible to use crypto accelerator cards, though this is currently not supported (because of the lack of customer demand).

We are currently evaluating a technology that could increase our performance even more, using a custom kernel module. In our experience these kernel extensions can increase proxy throughput significantly (copying files in kernel space is about ten times faster than doing the same in userspace). I think raw throughput (i.e. without the proxy startup time) can be increased by 100%.

3.5 How can I serve a huge number of concurrent threads with the smallest possible delay?

The solution is to split your single Zorp instance into several smaller instances that share the same set of connections. This can be achieved by running, for example, 16 instances of HTTP listening on different ports (for example 50080 - 50095), and then using 16 packet filter rules to distribute the load between the processes, based on source port for example.

How this can be achieved:

def define_services():
        Service("http", HttpProxy, ...)

def instance1():
        define_services()
        Listener(SockAddrInet('1.2.3.4', 50080), 'http')

def instance2():
        define_services()
        Listener(SockAddrInet('1.2.3.4', 50081), 'http')

def instance3():
        define_services()
        Listener(SockAddrInet('1.2.3.4', 50082), 'http')

etc.

You can use the stock --sport match with ranges to distribute the load, but it is better to use u32, where you can do things like letting the source port modulo 16 decide which listener to redirect to.

iptables -t tproxy -A PREROUTING -p tcp -m u32 --u32 '0>>22&0x3C@0>>16&0xF=0' -j TPROXY --on-port 50080
iptables -t tproxy -A PREROUTING -p tcp -m u32 --u32 '0>>22&0x3C@0>>16&0xF=1' -j TPROXY --on-port 50081
iptables -t tproxy -A PREROUTING -p tcp -m u32 --u32 '0>>22&0x3C@0>>16&0xF=2' -j TPROXY --on-port 50082
iptables -t tproxy -A PREROUTING -p tcp -m u32 --u32 '0>>22&0x3C@0>>16&0xF=3' -j TPROXY --on-port 50083

and so on. Creating 16 processes will probably suffice.
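
Writing out 16 nearly identical rules by hand is error-prone, so they can be generated. The sketch below is a standalone helper, not part of Zorp; the function name and its parameters (first_port, count) are made up for illustration. It reproduces the rule format shown above. In the u32 expression, 0>>22&0x3C yields the IP header length in bytes, @ jumps to the start of the TCP header, 0>>16 takes the upper 16 bits of its first word (the source port), and the final mask keeps the low bits, which is the same as modulo for a power of two.

```python
def tproxy_rules(first_port=50080, count=16):
    """Build one iptables rule per listener; count must be a power of two."""
    if count <= 0 or count & (count - 1):
        raise ValueError("count must be a power of two so that a bit mask acts as modulo")
    mask = count - 1  # e.g. 16 listeners -> mask 0xF -> source port modulo 16
    rules = []
    for i in range(count):
        match = "0>>22&0x3C@0>>16&0x%X=%d" % (mask, i)
        rules.append(
            "iptables -t tproxy -A PREROUTING -p tcp -m u32 --u32 '%s' "
            "-j TPROXY --on-port %d" % (match, first_port + i)
        )
    return rules

# Print the rules so they can be reviewed before being pasted into a script.
for rule in tproxy_rules():
    print(rule)
```

The listener ports used here have to match the ports of the Listener lines in the policy file.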

We have somewhere between 500 and 600 new connections/sec distributed over 4 computers running 16 processes each, and latency is OK.

3.6 How can I redirect the session to an arbitrary destination?

Use the setServer method of the session. In a proxy class you usually do something like

self.session.setServer(SockAddrInet('2.3.4.5', 80))

See destination.policy for a simple example, and virtualhost.policy to see how you can set up virtualhosting with Zorp.

3.7 How to define a zone with an irregular IP range?

I have to define a zone with an IP range which is not a regular subnet. Should I define one zone for each "C" subnet?

Let's see it through an example: your address range is 1.2.50.0 - 1.2.70.255. In the simplest case, you define one zone by enumerating all of your subnets:

InetZone("myzone",["1.2.50.0/24","1.2.51.0/24",...,"1.2.70.0/24"])

The three dots in the above example mean that you actually have to write out all 21 of your C subnets. Some subnet computation will show you that the address range can be split into 6 nonuniform subnets:

InetZone("myzone",["1.2.50.0/23", "1.2.52.0/22", "1.2.56.0/21", "1.2.64.0/22", "1.2.68.0/23", "1.2.70.0/24"])
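
The subnet computation does not have to be done by hand. As a sketch, assuming a separate Python 3 interpreter is available (this is a standalone helper, not something to put into the policy file), the standard ipaddress module can split an arbitrary range into the minimal list of CIDR blocks:

```python
import ipaddress

# First and last address of the irregular range from the example above.
first = ipaddress.IPv4Address("1.2.50.0")
last = ipaddress.IPv4Address("1.2.70.255")

# summarize_address_range() yields the shortest list of networks covering the range.
subnets = [str(net) for net in ipaddress.summarize_address_range(first, last)]
print(subnets)
```

The resulting list can be pasted into the InetZone definition as it is.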

Another trick is the use of umbrella zones. For example to describe the fact that 1.2.0.0-1.2.120.0 is your intranet, and everything else is the internet, you can define the following zones:

        InetZone("internet","0.0.0.0/0")
        InetZone("intranet","1.2.0.0/16", umbrella=TRUE)
        InetZone("internet_too",["1.2.128.0/17","1.2.121.0/21"],admin_parent="internet")

The above means: all addresses are internet except 1.2.0.0/16, which is intranet, with the exception of "1.2.128.0/17" and "1.2.121.0/21", which are internet again.

3.8 The ftp proxy says: "Possibly bounce attack". What happened?

The command connection and the data connection of the ftp session wanted to use different IP addresses. Zorp considers this malicious activity. Ftp bounce attacks are used to:

overcome restrictions on where files can be downloaded from.
remote control things so that it is harder to find out the original source. This is used for sending spam, for example.
circumvent firewall rules. This is made harder by the fact that data connections are unidirectional in Zorp :)
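
To illustrate the idea behind the check, here is a minimal sketch. It is not Zorp's actual implementation and the function names are made up for this example. It parses the argument of an FTP PORT command (six comma-separated numbers: four for the IP address, two for the port) and compares the requested data address with the address of the peer on the command connection.

```python
def parse_port_argument(arg):
    """Parse the 'h1,h2,h3,h4,p1,p2' argument of an FTP PORT command."""
    parts = [int(p) for p in arg.strip().split(",")]
    if len(parts) != 6 or any(not 0 <= p <= 255 for p in parts):
        raise ValueError("malformed PORT argument: %r" % arg)
    ip = ".".join(str(p) for p in parts[:4])
    port = parts[4] * 256 + parts[5]  # high byte first, then low byte
    return ip, port

def is_possible_bounce(control_peer_ip, port_argument):
    """True if the data connection is requested towards a different IP
    than the one the command connection comes from."""
    data_ip, _port = parse_port_argument(port_argument)
    return data_ip != control_peer_ip

print(is_possible_bounce("10.0.0.5", "10,0,0,5,4,1"))      # same host
print(is_possible_bounce("10.0.0.5", "192,168,1,9,0,25"))  # third party, port 25
```

The second call is the typical bounce pattern: the client asks the server to open the data connection to a third machine, here to its SMTP port.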

3.9 Zorp bails out with no usable logs even when I bump up the log level. What to do?

In some situations, when logging to syslog, zorp doesn't wait for all messages to be flushed at exit.

You can tell zorp to log to console instead of syslog with the -l switch.

Unfortunately zorpctl drops all messages written to console, so you must start zorp without zorpctl. You can do it with the following command line:

        /usr/lib/zorp/zorp --as <instance_name> -l

It then writes to the console what the problem is.

If you run into such a situation, please file a bug report describing the log messages which were not flushed.

3.10 Does the GPL version of zorp offer any authentication? I want to authenticate against my openldap server.

Some parts of the authentication infrastructure are missing in zorp-gpl, but you can do authentication.

The first step is to get the authentication data. You can either take it from a protocol element such as the http proxy auth header or the ftp username/password, or wrap the protocol in ssl and use the certificate for authentication.

With http and ftp you can use the upstream proxy attributes to figure it out. With pssl, you should use the certificate DN. It is exposed to the python level in zorp-unoff, but not yet in zorp-gpl.

The second step is to check the authentication data against some database. You can use ldap (I have done it myself) through the python ldap module in the usual way. Don't forget to install python-ldap.
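
The second step could look something like the sketch below. It only shows the LDAP check itself, not how it is wired into a proxy, and the function name and the optional ldap_module parameter (which only exists to make the function testable without a server) are made up for this example. It uses the python-ldap calls initialize, simple_bind_s and unbind_s: if the bind with the user's DN and password succeeds, the credentials are good.

```python
def ldap_bind_ok(uri, user_dn, password, ldap_module=None):
    """Return True if user_dn can bind to the LDAP server at uri with password."""
    if not password:
        # Many servers treat a bind with an empty password as an anonymous
        # bind and accept it, so refuse empty passwords up front.
        return False
    if ldap_module is None:
        import ldap as ldap_module  # provided by the python-ldap package
    conn = ldap_module.initialize(uri)
    try:
        conn.simple_bind_s(user_dn, password)
        return True
    except ldap_module.INVALID_CREDENTIALS:
        return False
    finally:
        conn.unbind_s()
```

A call would then look like ldap_bind_ok("ldap://ldap.example.com", "uid=joe,ou=people,dc=example,dc=com", password), where the server URI and the DN layout are placeholders for your own directory.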

3.11 How can I control protocols which Zorp does not have a proxy for?

If certain conditions are met (the protocol uses a limited number of known tcp/udp ports), you can let it through using the plug proxy.

If you know what you want to filter in the protocol, and certain other conditions are met, you can even filter it using the anypy proxy.

3.12 How can I use bandwith_to_client and bandwith_to_server to control bandwidth?

These plug variables are read-only. You can use packet_stats_interval_time and packet_stats_interval_packet to set the interval between two packetStats() events. If you set one of them, you should have a method called packetStats() which takes action if the bandwidth is higher than you want. See usr/share/zorp/pylib/Zorp/Plug.py for more info.

3.13 Do you have an ssh proxy?

Not yet. Balabit has said that they will do it if there is enough market demand.

How can I protect my systems from password scans?

See ssh.policy for an example.


3. Non-http related questions

3.1 What does stacking mean?