Map and projects (the most frequently updated page of this blog)


Don't you find history a bit repetitive?

a simple API jump
If you check Wine's source:
void WINAPI RtlCopyLuid (PLUID LuidDest, const LUID *LuidSrc)
{
    *LuidDest = *LuidSrc;
}
you see that this little NtDll API is very powerful: no check is done, so it can copy 8 bytes (the size of a LUID) from any address to any other. A simple way to use it is just to jump, by setting the right arguments.


...Doesn't know yet that he has to dance

misc update
Just to let you know I updated the Map and Downloads, hoping things will be a little more detailed about my various experiments.

small update
Just to tell you that I updated the Map and the Downloads, hoping things are a bit clearer regarding my tinkering.


Without thinking, don't ask me how...

a bit of nostalgia
My first game crack was purely accidental (I own the game too!):
both published by MicroProse, F-15 Strike Eagle II and F-19 Stealth Fighter had the same structure:
a small (F15|F19).COM file calling the main GAME.EXE file.
What if you swap one game's .COM with the other game's?


We stick together, no one fights alone

Finished PE/Packers/Opcodes graphics

As I added Data Directories to the PE infographics, my 3 infographics projects are now finished:


Before you judge me, take a look at you

Packers' algorithms

I created one last diagram, showing Packers' most common algorithms.


If you're looking for a bit of cheer, come and take a trip to...

Typical behavior of the various kinds of packers

I made an infographic showing 3 different kinds of packers, their usual steps and the characteristics of each of these steps.


I time every journey to bump into you

PE file and memory layouts
I created a graph (diagram?) for the PE format, showing 'standard' layouts of a PE file, on disk and in memory.


Fight against easy words, fight against the hatred of ...

user-mode opcodes cheat sheets
I mostly work on user-mode code, or kernel-mode code that actually uses a very limited number of privileged opcodes, just to access CR0 and IF. Besides, FPU/MMX/SSE are usually used as junk or pure calculation that I can ignore.

So, from that limited perspective, the number of opcodes is much smaller.

A Perspective of two-byte opcodes
After my overview of one-byte opcodes, I made a graph of two-byte opcodes according to that perspective.
It makes it much more readable than expected!

Opcodes' reminders
Also, I checked every user-mode opcode, and wrote a one-liner describing each, along with a small example. I put together an executable with all the examples, just to see them in action - and to test your favorite emulator ;)

The result is a set of small opcode reminders, in both printable text and executable code formats.


If you wanna make the world a better place, take a look at ...

typical packer entry-points
It can be useful to have a reminder of the entry points of the most common packers - especially the light ones, which are likely to be hacked or used as an inner layer.

When my 'shes' crumple and my 'hes' drown

pages on anti-debuggers and PE oddities

I created 2 new pages: one is about anti-debuggers (nothing new, just in a compact, printable form), and the other about PE oddities.


Life's a piece of sh.t, when you look at it

Overview of one-byte opcodes

I made a simple one-page overview of one-byte opcodes:


Would I have been better or worse than those people, if I had been...

packers' categories and features
Following my graph of the packers' landscape, I made a graph showing the different categories of packers, and what kind of features they have.
Then, to go deeper in details, I made a more detailed list of the various features for each of these kinds.


You can't hide nowhere, with the torchlight on

an emptier TinyPE
TinyPE is an impressive project that explains, step by step, how to make an incredible 97-byte functional PE. It also shows that a PE can't be any smaller: otherwise IMAGE_OPTIONAL_HEADER32.Subsystem, which is a critical field, wouldn't be defined. That field is even already shortened from a word to a byte.

However, the original TinyPE still defines a section and SizeOfOptionalHeader, which are not necessary.
Removing them makes such a PE not only tiny, but also very sparse in information. Yet it works, naturally, and there's (relatively) quite some room for code.
In the end, here are the only defined fields, across all PE header fields:


If it's your body that moves, it's your heart that does it all

Getting the current EIP
While standard code starts at a fixed address, there are several cases when your code needs to know its current IP:
  • after a vulnerability has been triggered, shellcodes can't know in advance where they are executing exactly
  • packers often allocate a buffer and decompress their next layer of code, which will likely need to locate itself at some point
  • relocating code is a good way to avoid breakpoints: same code, somewhere else

Thus, I'll enumerate ways to get your current EIP, in a file, on which you can test your emulator or debugger.


no, I'm your father

misc news
Opcodes 'complete'

My file listing all known 32-bit opcodes is almost done: everything documented should be in (including AVX, XOP, Padlock, LWP), and 99% of the undocumented stuff I can think of is in (to be blogged later)

Packers graph now printable


And when I start to come undone, stitch me together

Exception triggers
Structured Exception Handling is a complex mechanism that makes many anti-debuggers / anti-emulators possible. After setting a handler (check Subtle SEH for exotic ways, though never used in the wild), you trigger the exception. Typically, packers rely blindly on the trigger itself, such as the actual error code: in short, trigger the wrong exception, and execution will fail (tampering is detected).
The most common ones are:
Int3 BREAKPOINT 080000003h
mov [0], ... ACCESS_VIOLATION 0c0000005h
But what about the rest?

I put together common exception triggers. There is no point listing all of them and all possible triggers, just common ones found in packers or malware, or the ones with a non-obvious behavior.

Access violation

This is probably the most common one, as it can happen 'naturally'. Access a wrong address, and it will trigger. Note that it also triggers when trying to write to a read-only address.
Also, most interrupts, including CD01 Int 1 and CD20 Int 20h, will trigger this exception. This is different from F1 IceBP, which is sometimes written Int1 and triggers a single-step exception. Int 20h used to be for VxDCalls under Windows 9x, so that use is not relevant anymore.


I have legalised robbery - called it belief

Real life security (fails?)
How secure is your network if your front door is wide open?


Like software protection (and, say, virtualization), a different design in locks can bring added protection and extra features. If you are interested in your own front door security, I advise reading about the fascinating Geminy Lock and Abloy articles here. It's impressive to see that a Geminy withstood more than 30 minutes of continuous attacks, and that an Abloy can have 2 different keys (one to open and one to close).
Also, even more fascinating (analysed in the 'Abloy special products' PDF), the Rosengrens RKL-10, which is resettable: lost your keys? order a new set and reset the lock! But this unique feature doesn't make it necessarily weaker.

Hotel room

Following the video about the Dubai assassination,


How many failures before we understand? And how many burned cars, to vote...

Libdasm downloads
I don't have much time lately for Libdasm, but it's not a reason to ignore it totally.
I think it was a bad idea to remove downloads (and binaries?), let me know what you think!
On one hand, it's handy to just have source in one click, but on the other hand, I don't think it's that relevant to bundle binaries,
because they depend on your compiler or OS, and if you code, well, you're likely to have the compiler (and it compiles quickly and easily).

Let me know what you think, by commenting or replying in the group

In the meantime I added the original archive, and the current source as downloads.


None can outrun or equal ... the power ... of Megablast

a graphical representation of the packers landscape
I created a graphical representation of the packers' landscape. It's certainly far from complete (could it ever be, honestly ?), but it might be useful to you.

Comments are welcome!

PDF svg


Just remember, it's not so long since you were young

English only?
I removed the previous poll since it looks like it was not worth it - my twitter will stay suspended ;)

However, now I'm asking you if you're ok with this blog being bilingual, English first then French.
It might annoy English readers, or frustrate French readers. Or maybe French readers only read the English part.
The 3 possible outcomes of that poll would be:
- keep as is
- split French into another blog
- remove French

the poll is on your right.



Drivers in user-mode

Ever wanted to trace a driver directly from OllyDbg, without the usual
Unable to start file 'driver.sys'


I already introduced the basics of a driver, at PE level.
It might be interesting to run a driver in user-mode, for example, to unpack it:
On one hand, if a driver is packed, you just won't be able to quickly run and dump it the usual way, so you'd have to use a kernel debugger.
On the other hand, packed drivers typically unpack themselves with few or no API calls and few or no privileged instructions, which makes you think:
'this is standard user-mode code that just runs inside a driver to unpack itself, if only I could just run it the usual way'.

Loading the driver

But there are 2 things that prevent Driver.sys from loading under a user-mode debugger:


Everyday activist of inhumanity

TLS and Imports

When is an apparently incorrect TLS entry actually executing a file ?

I already introduced TLS:
before execution of the Entrypoint, each callback is taken as is - since it's a VA - and executed, until a null entry or an exception occurs.

linked to Imports

But if you make the callbacks point to one of the imports:
AddressOfCallBacks dd __imp__WinExec


If you want to strike me down in anger

Messing with loops
Do you understand these snippets?
setz ah                 setnz cl
aad 11                  xor eax, eax
add eax,04000f3         mov fs:[eax], esp
jmp eax                 ror cl, 01

the problem

When reversing a program, fast forwarding by skipping loops is important - no one wants to step through each iteration. Also, detecting loop behavior is important in emulators, especially when extra loops are inserted to make them time out.

Let's take a simple example:


You're so fine, lose my mind, and the world seems to...

I moved non-technical blog entries here to keep this blog coherent.

I put all the non-technical posts here so that this blog stays coherent.


And then the ones we shouldn't...

Undocumented opcodes and behaviors

Ever seen this before?
00400181 0F1F ??? ; Unknown command

As my opcode file is now close to completion, I made a working test executable with undocumented or uncommon opcodes, that you could use to test your own emulator or disassembler.

Note that if you use an older tool, opcodes might not be disassembled at all. If you're using Ollydbg (1.1), get a copy of BeatriX' FullDisasm to add support for the latest opcodes.
Let's start:


It's just a flesh wound

Section-less PE file (updated)
You may not expect a PE to be valid without all its standard structures:
Dos Header, Nt Headers, File Header, Optional Header, Data Directories, Section Headers.
TinyPE already proved that the Data Directories are not compulsory, but sections are not always required either.

If the alignment is smaller than 1000h (800h or less), and the number of sections is zero, the loader loads the file directly as-is (RVA = Offset). And since there are no sections, you don't need a section table at all.
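
To make the RVA = Offset rule concrete, here is a minimal Python sketch of an RVA-to-offset translation. The `Section` tuple and the `rva_to_offset` name are made up for this illustration, and header-backed RVAs are ignored in the sectioned case, so treat it as a simplified model rather than what the loader really does.

```python
from typing import NamedTuple, Optional, Sequence


class Section(NamedTuple):
    # Hypothetical, simplified view of a section header.
    virtual_address: int      # RVA where the section is mapped
    pointer_to_raw_data: int  # file offset of the section's data
    size_of_raw_data: int


def rva_to_offset(rva: int, sections: Sequence[Section],
                  section_alignment: int) -> Optional[int]:
    """Translate an RVA to a file offset, or None if it is not file-backed."""
    if not sections and section_alignment < 0x1000:
        # Section-less, low-alignment case: the file is mapped as-is.
        return rva
    for section in sections:
        start = section.virtual_address
        if start <= rva < start + section.size_of_raw_data:
            return section.pointer_to_raw_data + (rva - start)
    return None


print(hex(rva_to_offset(0x123, [], 0x200)))
```

With no sections and a 200h alignment, the RVA comes back unchanged. With a regular section table, it goes through the usual VirtualAddress / PointerToRawData arithmetic.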


from the madness of colors, I want to pick other flowers too

a PE Headers graph
If you're looking for a good representation of the PE format, OpenRCE's poster from Ero Carrera is the standard.
I tried making my own representation, and started a multi-page one: lighter to open, easier to print (A4-formatted), and with elements shown differently depending on their importance.


We'll have no more bread on the board, because the board will have burned

messing with sections physical offset
With a high alignment (>= 1000h), nothing prevents 2 sections from sharing the same physical data.
Thus, if 2 sections with different virtual addresses have the same PointerToRawData and SizeOfRawData, their content will initially be the same. Relocations and imports will be applied afterward, though.
.VirtualSize dd Section0Size
.VirtualAddress dd Section0Start - IMAGEBASE
.SizeOfRawData dd Section0Size
.PointerToRawData dd Section0Start - IMAGEBASE

.VirtualSize dd Section0Size ; same
.VirtualAddress dd Section1Start - IMAGEBASE
.SizeOfRawData dd Section0Size ; same
.PointerToRawData dd Section0Start - IMAGEBASE ; same
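
As a rough illustration of the two headers above, this Python sketch mimics a naive loader copying raw data into each section. The file contents, section names, addresses and the `map_sections` helper are all made up for the example, and relocations and imports are left out.

```python
def map_sections(file_data: bytes, sections):
    """Return {virtual_address: bytes} as a naive loader would fill memory."""
    memory = {}
    for _name, virtual_address, pointer_to_raw_data, size_of_raw_data in sections:
        end = pointer_to_raw_data + size_of_raw_data
        memory[virtual_address] = file_data[pointer_to_raw_data:end]
    return memory


# Hypothetical 400h-byte file: a few code bytes at offset 200h, zeros elsewhere.
file_data = bytes(0x200) + b"\x6a\x00\xc3" + bytes(0x1FD)

sections = [
    # (name, VirtualAddress, PointerToRawData, SizeOfRawData)
    ("sect0", 0x1000, 0x200, 0x200),
    ("sect1", 0x2000, 0x200, 0x200),  # same physical data, different address
]

memory = map_sections(file_data, sections)
print(memory[0x1000] == memory[0x2000])
```

Both virtual addresses end up backed by the same bytes, which is all the trick relies on.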


Policeman got no gun, U don't have 2 run

SMSW based anti-emulator/stepping
SMSW (store machine status word) stores the 16 lowest bits of cr0 in the operand register. In the case of SMSW with a reg32, the highest word is not defined - it seems to be always 8001h, though.

It makes it a weird reg32 opcode (why accept a 32-bit operand if the highest bits are left undefined and a 16-bit counterpart exists?) but it definitely changes the highest word (some disassemblers invariably show a word operand, which is wrong).

While 'mov eax, cr0' is a privileged instruction, SMSW isn't.


The hen never laid and the corn never growed

anti-* with the GS register
On thread switch, the GS register value is not restored (32 bits only).
It's a simple statement that leads to anti-* tricks (debugger/tracing/emulator) that defy common sense. (It's one of my favorite anti-*, since it doesn't call any API and requires thinking outside the box.)

When stepping, threads are switched, so your debugger might lose the right value.
Try it yourself:
  1. open debugger
  2. set GS to a non-zero value
  3. step, even once
  4. GS might be zero already!

so it makes for an easy anti-stepping trick:


You can rock this land, baby

the other subsystems
In your Windows directory, most drivers have many sections, including the PAGE and INIT ones, where the EP is. All this is pretty scary, while, in the end, only a very small amount of information (compared to a GUI PE) is necessary to create a working driver:
as expected, the Subsystem has to be set to NATIVE, then relocations are compulsory since you can't tell in advance where the driver will be loaded, and a correct PE checksum is required to have the driver running.
And that's all!


Sail to the edge and I'd be there

Messing with the TLS
TLS, aka Thread Local Storage, is a way to execute some code before the EntryPoint or after ExitThread/ExitProcess.
The 10th Data Directory points to a structure, and one of its elements (a VA, not an RVA) points to a null-terminated list of callbacks, which will be called one after the other.
This list is stored as VAs (it includes the ImageBase then), which makes it quite uncommon among the PE structures.
AddressOfCallBacks dd Callbacks ; VA
Callbacks:
dd TLS
dd 0 ; null-terminated list

The size of the Data Directory is not taken into account. Some tools may wrongly ignore the TLS if that size is not defined, though.
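
To show how the list is walked, here is a minimal Python sketch. It assumes a 32-bit PE and a flat image where RVA equals offset, and the `tls_callbacks` name, the image layout and the addresses are invented for the example. The only structural fact used is that AddressOfCallBacks is the fourth dword of IMAGE_TLS_DIRECTORY32.

```python
import struct


def tls_callbacks(image: bytes, image_base: int, tls_directory_rva: int):
    """List callback VAs from a flat image where RVA == offset (32-bit PE)."""
    # IMAGE_TLS_DIRECTORY32: AddressOfCallBacks is the 4th dword (offset 12).
    (callbacks_va,) = struct.unpack_from("<I", image, tls_directory_rva + 12)
    offset = callbacks_va - image_base   # it's a VA, so remove the ImageBase
    callbacks = []
    while True:
        (va,) = struct.unpack_from("<I", image, offset)
        if va == 0:                       # a null entry terminates the list
            break
        callbacks.append(va)
        offset += 4
    return callbacks


# Hypothetical image: TLS directory at 40h, callback list at 60h.
IMAGE_BASE = 0x400000
image = bytearray(0x100)
struct.pack_into("<I", image, 0x40 + 12, IMAGE_BASE + 0x60)
struct.pack_into("<III", image, 0x60, 0x400080, 0x400090, 0)

print([hex(va) for va in tls_callbacks(bytes(image), IMAGE_BASE, 0x40)])
```

The sketch stops at the null entry only. The real loader also stops when a callback raises an exception, which a static walk like this cannot model.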

Callbacks are executed on (before) thread start and on (after) thread exit. However (credit goes to Peter Ferrie and Kris Kaspersky here), TLS callbacks are only executed if the file imports at least one DLL that itself imports kernel32. So, if kernel32.dll is the only 'official' import (it doesn't mean it's the only DLL in the program space), the callbacks are not executed.


If you got the money honey, we got your disease

Messing with the EntryPoint
In most files, the EP is in the first section. In many packers or file infectors, it will be in another section. It's actually commonly found in the header itself (Upack, FSG), and sometimes (like - among others - in collapsed.asm), it's at RVA 0, in which case the MZ signature is just interpreted as dec ebp, pop edx, which is benign. Many packers just put some trampoline code at RVA 0, and the rest of the code further on.
So, usually:
Section0 VA <= EntryPoint <= Section0 VA + Physical Size
and more generally:
0 <= EntryPoint <= SizeOfImage
But no check is actually done on the EntryPoint value!
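
The cases above can be sketched as a small Python classifier. The `locate_entry_point` name, the tuple layout and the sample values are assumptions made for this example. It only describes where the value falls, since (as said) the loader itself enforces nothing.

```python
def locate_entry_point(entry_point, size_of_headers, size_of_image, sections):
    """Describe where an AddressOfEntryPoint RVA falls.

    sections: list of (name, virtual_address, size) tuples.
    """
    if entry_point == 0:
        return "RVA 0 (the MZ signature runs as code)"
    if entry_point < size_of_headers:
        return "inside the header"
    for index, (name, virtual_address, size) in enumerate(sections):
        if virtual_address <= entry_point < virtual_address + size:
            kind = "first section" if index == 0 else "another section"
            return f"{kind} ({name})"
    if entry_point < size_of_image:
        return "inside the image, outside any section"
    return "outside the image"


# Hypothetical layout, just to exercise each branch.
SECTIONS = [(".text", 0x1000, 0x200), (".data", 0x2000, 0x200)]
for ep in (0x1010, 0x2010, 0x10, 0, 0x5000):
    print(hex(ep), "->", locate_entry_point(ep, 0x200, 0x3000, SECTIONS))
```

A check like this is handy as a heuristic in a PE viewer: anything other than "first section" is worth a second look.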


With a rebel yell! more, more, more!

Description of a compiled PE header
In my previous posts, I started exploring PE Headers with a minimum amount of information (as opposed to the official specifications). On the other hand, standard compilers like MASM add more elements (not necessarily documented), on top of defining, as you would expect, most elements of the structures.

To understand things correctly, I assembled and linked a simple HelloWorld source in MASM, and reproduced the complete structure of the executable with a YASM source (that defines every byte of the header manually).


Hey, hey, hey, what's in your head?

PE Header holes / filling them
Since the PE loader in Windows is very flexible, most of the PE Header information can be discarded.
As the Tiny PE project proved, it's possible to get a 97-byte PE! It also proved a valid PE can't be smaller, as 97 bytes is the minimum size to fit all the structures up to OPTIONAL_HEADER.Subsystem, the last compulsory element.
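
The 97 figure can be rebuilt from the structure sizes. This little Python computation assumes the NT headers are overlapped onto the DOS header at offset 4 (my understanding of how Tiny PE collapses the headers), and the constant names are mine.

```python
E_LFANEW = 4                 # assumption: NT headers start at offset 4
SIGNATURE_SIZE = 4           # "PE\0\0"
FILE_HEADER_SIZE = 20        # IMAGE_FILE_HEADER
SUBSYSTEM_IN_OPTIONAL = 68   # offset of Subsystem in IMAGE_OPTIONAL_HEADER32

subsystem_offset = (E_LFANEW + SIGNATURE_SIZE + FILE_HEADER_SIZE
                    + SUBSYSTEM_IN_OPTIONAL)
# Only the low byte of the 2-byte Subsystem field is kept in the file.
minimal_size = subsystem_offset + 1

print(hex(subsystem_offset), minimal_size)  # 0x60 97
```

So Subsystem lands at offset 60h, and keeping just its first byte gives the 97 bytes.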

In my short one-section file header (which I use in my helloworld.asm example), I define a minimum (not an absolute minimum, though) amount of elements of the PE structure, to have a file with Imports, Section and EntryPoint (none of them is strictly necessary):
e_magic (constant)
Signature (constant)
Machine (almost constant)
NumberOfSections (not strictly necessary)
Magic (almost constant)
AddressOfEntryPoint (not strictly necessary)
MajorSubsystemVersion (almost constant)
NumberOfRvaAndSizes (not strictly necessary)
ImportsVA (not strictly necessary)
VirtualAddress (not strictly necessary)
SizeOfRawData (not strictly necessary)
PointerToRawData (not strictly necessary)


They say jump, you say how high

Various ways of JMPing
Jumping, aka branching, is one of the most common operations.

I wrote a file that implements many forms of jumping, whether they are common, obfuscated, or rare. Not everything is detailed in this post, check the source for further information.

First, Jumps,

EB 07 JMP SHORT 004000F9
E9 07000000 JMP 00400105
FFE7 JMP EDI ; 00400113
FF25 19014000 JMP DWORD PTR DS:[400119] ; 00400124
EA 32014000 1B00 JMP FAR 001B:00400132
FF2D 38014000 JMP FAR DS:[400138] ; DS:[00400138]=001B:00400145
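
For the two relative forms at the top of that list, the target is computed from the end of the instruction. Here is a minimal Python sketch of that arithmetic. The `relative_target` name is made up, and the instruction addresses (004000F0 and 004000F9) are not in the listing: I inferred them from the displayed targets.

```python
def relative_target(address: int, encoding: bytes) -> int:
    """Target of a JMP SHORT (EB rel8) or JMP near (E9 rel32) at `address`."""
    opcode = encoding[0]
    if opcode == 0xEB:
        displacement = int.from_bytes(encoding[1:2], "little", signed=True)
    elif opcode == 0xE9:
        displacement = int.from_bytes(encoding[1:5], "little", signed=True)
    else:
        raise ValueError("only EB and E9 are handled in this sketch")
    # The displacement is relative to the end of the instruction.
    return (address + len(encoding) + displacement) & 0xFFFFFFFF


# Assumed instruction addresses, inferred from the targets shown in the listing.
print(hex(relative_target(0x4000F0, bytes.fromhex("EB 07"))))
print(hex(relative_target(0x4000F9, bytes.fromhex("E9 07 00 00 00"))))
```

The register, memory and far forms don't encode a displacement at all, so their targets only exist at run time (or in the operand itself for EA).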

then CALLs,

E802 CALL 00400103
9A 7C014000 1B00 CALL FAR 001B:0040017C


C3 RETN ; Return to 004001DE
CB RETF ; Return to 001B:004001EC
CF IRETD ; Return to 001B:004001FB, flags = 206



Storm warning, but there's no fear

relocater < mutater < virtualiser
I already wrote about a relocater and different kinds of virtual machines. Between the two, there is another kind of executable, simpler than a virtual machine but particularly suitable for obfuscation:
a mutater, or polymorphic code.
Similar to virtual machines, some data represents the virtual code to execute. However, in this case, the architecture is strictly the same as the CPU's. The main point of mutation is randomization. And if you add some junk code in the middle, you get what happens when viruses modify themselves from one file to the next.


And go where you're going to

To be able to create custom PEs, I wrote a simple script that helps with simple tasks like generating import structures, PE checksum and default values.

So, add all PE structures manually (or better, use the same one over and over), generate imports, and voila! you have a handmade PE file in which you control every byte.

I didn't extend (yet?) that script to Exports/Resource/Relocations/TLS/Sections, because I don't use them so often.
Also, different Section/File alignments are not supported. Once again, I don't really need it (often).
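
For reference, here is a Python sketch of the PE checksum computation as I understand it: a 16-bit sum with carry folding over the whole file, with the CheckSum field counted as zero, plus the file length. This is not the script mentioned above, the `pe_checksum` name is mine, and it assumes e_lfanew is even so the field falls on word boundaries.

```python
import struct


def pe_checksum(data: bytes) -> int:
    """Compute the value expected in IMAGE_OPTIONAL_HEADER.CheckSum."""
    (e_lfanew,) = struct.unpack_from("<I", data, 0x3C)
    # Signature (4) + file header (20) + 64 bytes into the optional header.
    checksum_offset = e_lfanew + 4 + 20 + 64
    padded = data + b"\0" * (len(data) % 2)   # whole number of 16-bit words
    total = 0
    for offset in range(0, len(padded), 2):
        if checksum_offset <= offset < checksum_offset + 4:
            continue                           # the field itself counts as zero
        (word,) = struct.unpack_from("<H", padded, offset)
        total += word
        total = (total & 0xFFFF) + (total >> 16)   # fold the carry back in
    return (total + len(data)) & 0xFFFFFFFF


# Hypothetical 200h-byte buffer: only e_lfanew and a junk CheckSum are set.
sample = bytearray(0x200)
struct.pack_into("<I", sample, 0x3C, 0x40)
struct.pack_into("<I", sample, 0x98, 0xDEADBEEF)   # must be ignored
print(hex(pe_checksum(bytes(sample))))
```

Since the field is skipped, the function can be run on a file before or after the checksum is written and gives the same result.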

Source directory


Homemade PE

To be able to create special PEs, I wrote a simple script that handles small tasks like generating the import structures, computing the checksum or setting default values.

So, add the PE structures by hand (or better, always use the same header), generate the imports, and voilà! You have a handmade PE in which you control every byte.

I haven't (yet?) added support for Exports/Resource/Relocations/TLS/Sections, because I don't need them that often.
Likewise, different Section/File alignments are not supported. There too, I don't need them (often).

Source directory

Useless but original

A different form of junk code
You probably know about the overlapping instruction technique used to fool disassemblers:
due to the way x86 CPUs work, jumping over an E8 byte will make a bogus CALL instruction appear in the code.

If you use a longer instruction like IMUL, you can fit any instruction inside its operands, and build the whole code out of such blocks.
So from the outside, whether in hex or in assembly, it looks quite blocky:
EB 02 JMP SHORT 004000F4
69846A 40681C01 4000EB02 IMUL EAX,[EDX+EBP*2+11C6840],2EB0040
698468 22014000 9090EB02 IMUL EAX,[EAX+EBP*2+400122],2EB9090
69846A 00E81E00 0000EB02 IMUL EAX,[EDX+EBP*2+1EE800],2EB0000
69846A 00E81900 00005461 IMUL EAX,[EDX+EBP*2+19E800],61540000

while the execution trace looks almost normal:
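
As a rough way to see this for yourself, the following Python sketch walks the bytes of the listing above the way the CPU would: it takes each JMP SHORT and steps over CALLs. The length table is hand-written and only covers the opcodes that execution actually reaches (PUSH imm8/imm32, CALL, NOP, JMP SHORT, PUSH ESP, POPAD), so it is in no way a real disassembler.

```python
# Bytes from the listing above (JMP SHORT followed by four IMULs).
CODE = bytes.fromhex(
    "EB 02 "
    "69 84 6A 40 68 1C 01 40 00 EB 02 "
    "69 84 68 22 01 40 00 90 90 EB 02 "
    "69 84 6A 00 E8 1E 00 00 00 EB 02 "
    "69 84 6A 00 E8 19 00 00 00 54 61"
)

# Hand-written length table: only the opcodes reached by execution here.
LENGTHS = {0x6A: 2, 0x68: 5, 0xE8: 5, 0x90: 1, 0xEB: 2, 0x54: 1, 0x61: 1}


def trace(code: bytes):
    """Follow execution linearly, taking JMP SHORTs and stepping over CALLs."""
    executed = []
    offset = 0
    while offset < len(code):
        opcode = code[offset]
        length = LENGTHS[opcode]
        executed.append((offset, code[offset:offset + length].hex(" ").upper()))
        offset += length
        if opcode == 0xEB:  # JMP SHORT: add the signed rel8 displacement
            offset += int.from_bytes(code[offset - 1:offset], "little", signed=True)
    return executed


for offset, raw in trace(CODE):
    print(f"{offset:02X}: {raw}")
```

The 69 opcode of the IMULs is never reached (it isn't even in the table): every JMP SHORT lands 2 bytes into the next IMUL, on the PUSH or CALL hidden in its operands.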

You'll stumble in my footsteps

A different flow obfuscation: a relocater
I wrote a simple executable, implementing an idea by Piotr Krysiuk, where all routines are made to be executed at the same address. Because of that feature, following the flow is potentially difficult, and creating a direct dump could be annoying, as no disassembler allows different pieces of code to be present at the same address.

To give you an example, here are the 2 functions of that binary upon their execution:
004000FA 6A 40 PUSH 40
004000FC 68 6E014000 PUSH 0040016E ; ASCII "Tada!"
00400101 68 74014000 PUSH 00400174 ; ASCII "Hello World!"
00400106 6A 00 PUSH 0
00400108 E8 55000000 CALL 00400162 ; MessageBoxA

004000FA 6A 00 PUSH 0
004000FC E8 67000000 CALL 00400168 ; ExitProcess


when CPUs have too many opcodes...

Back from my last post to real machines, I decided to release as-is a YASM source that contains most x86 32-bit opcodes, including SSE, AVX, FPU,...

My conclusion is that there are way too many!

You can use it just for curiosity or testing your favorite disassembler.

Source Code (Yasm)

The longest mnemonic (the opcode as a word) is
vaeskeygenassist xmm0, xmm0, 0
even though the recent
vbroadcastf128 ymm0, [0]
is not far behind.