Posts from 2017

  • On the migration to Python 3

    Recently, the DMOJ judge codebase has been migrated to Python 3, thanks to the combined efforts of me, Xyene, and kiritofeng. Many issues, such as Unicode handling, were exposed in the process.

    Since Python 2 is still in heavy use, at least in deployments of the DMOJ judge, compatibility with it must be maintained. This meant writing code that is compatible with both Python 2 and Python 3. The six library has proved tremendously helpful in abstracting away the differences, some of which are highly non-trivial.

    For example, six.with_metaclass hides away the difference in metaclass use. In Python 2, the __metaclass__ class member defines the metaclass used for the class, while in Python 3, one would specify it as class Class(metaclass=MetaClass). The latter would be a syntax error in Python 2, and the former has no effect in Python 3. six provides a solution that is highly non-obvious and yet works perfectly.
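
    To give a feel for why such a helper is non-obvious, here is a simplified sketch of a with_metaclass-style function. It is not six's actual source; the names _TemporaryMeta, Registry, Base and Plugin are made up for illustration. The idea is to return a throwaway base class whose own metaclass intercepts the class creation and rebuilds the class with the real metaclass and the real bases.

```python
def with_metaclass(meta, *bases):
    """Return a temporary base class; subclassing it yields a class built by `meta`."""

    class _TemporaryMeta(type):
        # Invoked when a class inherits from the temporary base below.
        # The temporary base is discarded and the real bases are used instead.
        def __new__(cls, name, this_bases, namespace):
            return meta(name, bases, namespace)

    # Build the temporary base directly with type.__new__ so that the
    # interception above is not triggered for the temporary class itself.
    return type.__new__(_TemporaryMeta, "temporary_class", (), {})


class Registry(type):
    """Hypothetical metaclass that records the name of every class it creates."""

    classes = []

    def __new__(mcs, name, bases, namespace):
        cls = super(Registry, mcs).__new__(mcs, name, bases, namespace)
        mcs.classes.append(name)
        return cls


class Base(object):
    pass


# The same line works without Python 3's metaclass= keyword and without
# Python 2's __metaclass__ attribute.
class Plugin(with_metaclass(Registry, Base)):
    pass
```

    After this runs, type(Plugin) is Registry and the temporary class does not appear in Plugin's MRO.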

    The most frustrating part is Unicode handling. The DMOJ judge was written somewhat sloppily with regard to Unicode, dealing mostly with bytestrings and raw bytes. With the separation of bytes and str in Python 3, every string in the judge must be turned into either bytes or str on a case-by-case basis. We decided that source code and program output will be treated as raw bytes, and that textual data derived from these will be handled as UTF-8.
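
    As a rough illustration of that policy, the sketch below keeps program output as bytes and only decodes it as UTF-8 at the point where text is needed. The helper names utf8text and utf8bytes are assumptions chosen for this example, and replacing invalid sequences (rather than raising) is one possible choice, not necessarily what the judge does.

```python
def utf8text(data, errors="replace"):
    """Return text for `data`, decoding bytes as UTF-8 if necessary."""
    if isinstance(data, bytes):
        # Program output may contain arbitrary bytes, so invalid sequences
        # are replaced with U+FFFD instead of raising UnicodeDecodeError.
        return data.decode("utf-8", errors)
    return data


def utf8bytes(text):
    """Return bytes for `text`, encoding str as UTF-8 if necessary."""
    if isinstance(text, bytes):
        return text
    return text.encode("utf-8")


# Raw program output stays as bytes until it has to be shown to a human.
raw_output = b"caf\xc3\xa9\n"
feedback = utf8text(raw_output).strip()
```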

    (Read more...)
  • Installing Debian ARM64 on Raspberry Pi 3 with WiFi

    Most users are probably using Raspbian on their Raspberry Pi 3. However, Raspbian is designed for all Raspberry Pi devices, back to the original Raspberry Pi, which is ARMv6 with an FPU. As a result, it does not take advantage of the 64-bit support of the ARMv8 CPU on the Raspberry Pi 3.

    Debian has offered ARM64 support for a while and, being the base distribution for Raspbian, is quite similar. Conveniently, there is a pre-built Debian image for Raspberry Pi 3. You can download it and copy it to an SD card, and it should work out of the box.

    On Linux, the simple dd command shown on the Debian Wiki works. On other platforms, notably Windows, Etcher is reputed to work well and has an easy interface.

    The one flaw with this image is that the WiFi does not work.

    Update: The 20180108 image now works with WiFi out of the box. The following instructions are no longer necessary.

    (Read more...)
  • ARM Assembly: ∞ Ways to Return

    ARM is unusual among processors in having the program counter available as a “general purpose” register. Most other processors keep the program counter hidden, and its value is only disclosed as the return address when calling a function. If you want to modify it, a jump instruction is used.

    For example, on the x86, the program counter is called the instruction pointer, and is stored in eip, which is not a directly accessible register. When a function is called, the return address (the value of eip following the call) is pushed onto the stack, at which point it can be examined. Return is done through the ret instruction, which pops the return address off the stack and jumps there.

    Another example: on the MIPS, the program counter is stored into register 31 after executing a JALR instruction, which is used for function calling. The value in there can be examined, and a return is a register jump JR to that register.

    ARM’s unusual design allows many, many ways of returning from functions. But first, we must understand how function calls work on the ARM.

    (Read more...)
  • private and final fields: Can you actually hide data in Java?

    Sometimes, after many attempts, you realize that to complete your mission, you must access private fields, or perhaps change final fields.

    There are many imaginable reasons: the accessors copy the entire object before returning, which takes a very long time; the authors forgot to provide an accessor; the library function is highly inefficient and you need to do better; …

    Are you out of luck? Fortunately, no.

    (Read more...)
  • Online Judging Sandbox: From Linux to FreeBSD

    As most probably know, DMOJ uses a sandbox to protect itself from potentially malicious user submissions. An overview of the Linux sandbox has been published by my friend Tudor. However, it doesn’t go deep into the implementation details, many of which differ between Linux and FreeBSD.

    At its core, the sandbox, cptbox, uses the ptrace(2) API to intercept system calls before and after they are executed, denying access and manipulating results. The core is written in C, hence the name cptbox.

    Perhaps the most obvious difference between Linux and FreeBSD is that on Linux, ptrace(2) subfunctions are invoked as ptrace(PTRACE_*), while on FreeBSD, it is ptrace(PT_*). But this difference is rather superficial compared to the significant internal differences.

    (Read more...)
  • Effective Assembly: Bitwise Shifts

    Most people, when first starting assembly, still carry over a lot of high level constructs in their assembly programs. A common pattern is to multiply and divide when a bit shift would suffice.

    For example, a lot of people would write a program that prints the binary representation of an integer using the divide and modulo operations. This is rather inefficient compared to using shifts: division by 2 can be replaced with a right shift by 1, and modulo 2 can be replaced by a bitwise AND with 1.

    Aside: interestingly, taking any number modulo a power of two m is equivalent to doing a bitwise AND with m-1. The proof of this is left as an exercise for the reader.
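
    The two replacements are easy to check in a high-level language before writing any assembly. The following is a small sketch in Python, restricted to non-negative integers; the function names to_binary and mod_power_of_two are made up for this example.

```python
def to_binary(n):
    """Binary digits of a non-negative integer, using only shifts and AND."""
    if n == 0:
        return "0"
    digits = []
    while n > 0:
        digits.append(str(n & 1))  # same as n % 2: extracts the lowest bit
        n >>= 1                    # same as n // 2: drops the lowest bit
    # Digits were collected from least to most significant, so reverse them.
    return "".join(reversed(digits))


def mod_power_of_two(n, m):
    """n modulo m for a power of two m, computed as a bitwise AND with m - 1."""
    # A power of two has exactly one bit set, so m & (m - 1) must be zero.
    assert m > 0 and m & (m - 1) == 0, "m must be a power of two"
    return n & (m - 1)
```

    For instance, to_binary(10) produces "1010", and mod_power_of_two(n, 8) agrees with n % 8 for every non-negative n.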

    This post will address the basics you need to know about shifts to get up to speed on writing good assembly.

    (Read more...)
  • A new "Hello, World!" for C

    Most of us have a good idea how to write a simple “Hello, World!” program in C, but sometimes it feels a little too easy. Luckily, we can always make it more of a challenge!

    Consider a hypothetical situation where many symbols are banned, such as ", ', \, #, {, and }, and we aren’t allowed the string Hello, World! as a subsequence in the code. How would we write a “Hello, World!” program then?

    Is it impossible, because we can no longer use {} to write a block of code for a function? Is it impossible, because we can’t actually embed the string?

    (Read more...)
  • Build an interactive C++ Jupyter notebook via Cling

    Jupyter and IPython make for a very nice notebook, but by default it comes only with Python support. Fortunately, Jupyter supports many kernels, allowing many languages, from R to Redis and Perl to C++, to be supported. Unfortunately, getting these kernels to run is not exactly an easy business. This time, we will be dealing with cling, a Jupyter kernel for C++.

    (Read more...)
  • Using the Visual C++ compiler on Linux

    It is a fairly common practice to compile Windows applications on Linux build servers. However, this is usually done through an approach called cross-compiling. The essence of this approach is using a compiler that targets Windows applications but is itself a Linux application. Usually, the compiler used for this is MinGW (or MinGW-w64 these days), a GCC implementation for Windows.

    This works great when porting traditional Unix applications to Windows, since it meshes nicely with the traditional build system on Unix-like systems. But it is rather poor for standalone single .exe applications, which are more common in the Windows world. MinGW has a few DLLs that are needed to run the applications it compiles, and that ruins the single executable experience.

    The traditional way to build these simple applications in the Windows world is with the Microsoft compilers, usually in the form of Visual C++. These compilers are fairly nice, but they have one problem: they do not exist as cross compilers. (Well, they can cross compile between different processors, but the compilers themselves will only run on Windows.) What do we do then? Do we resign ourselves to not having single executable applications, or do we give up and buy a Windows build machine?

    (Read more...)
  • A polyglot header for Python and cmd.exe

    After seeing Raymond’s post on polyglot launchers for Perl and JScript with batch files, I decided to present one for Python:

    @python -x "%~f0" %* & goto :eof
    # Your Python code here.

    This one simply uses the special Python flag -x to ignore the first line, which is somewhat analogous to the -x Perl flag, but much simpler.

    I also have an alternative Perl polyglot header that does not require the special flag -x.

    @rem = '--*-Perl-*--
    @perl "%~f0" %*
    @goto :eof
    ';
    undef @rem;
    # Your Perl code here.

    (Read more...)
  • Onion Sites

    I am probably somewhat obsessed with having my websites accessible over a .onion domain, perhaps because I like vanity names (I’ll explain this later).

    A while ago, I introduced dmojsites2fpbeve.onion for DMOJ. And today, I introduce quantum2l7xnxwtb.onion for this website.

    These .onion websites are accessible over Tor, and do not ever leave the Tor network when accessed this way. Despite not having HTTPS (which is basically unattainable due to the lack of any certificate authority willing to issue free certificates for .onion), the encryption is end-to-end: only your computer and the server at my end can see the actual traffic in plaintext. For those familiar with the Tor network, there is no exit node which can watch your traffic in this setup.

    To preview these websites, you can use tor2web.org. In practice, you simply have to append .link after any .onion domain, and tor2web will take care of the rest. For example, quantum2l7xnxwtb.onion can be accessed as quantum2l7xnxwtb.onion.link. Note that you lose pretty much all the benefits of Tor this way.

    .onion domain names are composed of 16 “random” alphanumeric characters (more precisely, matching ^[a-z2-7]{16}\.onion$). These are derived from the public key of the onion site. Now, you may have noticed that the DMOJ and Quantum onion sites have a nice, identifiable prefix. This is called a vanity name. To generate these, we perform the equivalent of generating keys until we happen to get the desired prefix. This process is not too fast, as you can probably imagine.
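
    The brute-force search can be sketched as follows. This assumes the 16-character scheme described above, in which the name is the base32 encoding of the first 80 bits of the SHA-1 hash of the DER-encoded public key. To stay self-contained, the sketch hashes random bytes as a stand-in for real keys, so the addresses it finds do not correspond to usable onion sites; the function names and the 140-byte stand-in length are illustrative assumptions.

```python
import base64
import hashlib
import os
import re

# The pattern quoted in the text above.
ONION_RE = re.compile(r"^[a-z2-7]{16}\.onion$")


def onion_address(public_key_der):
    """Derive a 16-character .onion name from DER-encoded public key bytes."""
    digest = hashlib.sha1(public_key_der).digest()
    # 80 bits = 10 bytes, which base32-encodes to exactly 16 characters.
    name = base64.b32encode(digest[:10]).decode("ascii").lower()
    return name + ".onion"


def find_vanity(prefix, max_attempts=200000):
    """Try candidate keys until the derived name starts with `prefix`."""
    for attempt in range(1, max_attempts + 1):
        # Stand-in for generating a real RSA key pair and DER-encoding
        # its public key; a real tool would keep the private key too.
        fake_key = os.urandom(140)
        address = onion_address(fake_key)
        if address.startswith(prefix):
            return address, attempt
    return None, max_attempts
```

    Each additional prefix character multiplies the expected number of attempts by 32, which is why long vanity prefixes take so long to find.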

    (Read more...)
  • Getting a perfect score on the SSL Labs Server Test

    I decided to take it as a challenge to get a full perfect score on the de facto standard of SSL implementation quality, the Qualys SSL Labs Server Test.

    Needless to say, getting a perfect score is not without cost. For example, many browsers will be incapable of accessing the site. For this reason, I decided to use a “disposable” domain name: ssl100.quantum2.xyz, which also runs on a separate IPv6 address to prevent any contamination of this website (there is no IPv4 since I didn’t have a disposable address), so you will need IPv6 access.

    A screenshot of Qualys SSL Labs Server Test showing ssl100.quantum2.xyz getting a full score.

    Incidentally, this also gets an A+ on securityheaders.io.

    (Read more...)