ReactOS plans for 2012

January 22nd, 2012

This year, instead of looking back on the previous year to sum up what has been done, I will talk about what is planned.

The first thing is the long-awaited 0.3.14 release. It has been branched and is on its way. This will be an important ReactOS release. We took time to prepare it, and it comes with numerous great changes. Unfortunately, as several rewrites took place, some regressions are to be expected, and some people may no longer be able to install ReactOS. If that happens to you in a VM, change your virtualisation software (or downgrade it, in the case of VirtualBox). We know this is a real problem, but in spite of our efforts we could not get rid of it: fixing it will require a huge amount of work. Still, as said, there are ways to work around it.

Most of my work on ReactOS will take place in the background. In 2011, I became one of the ReactOS system administrators. This means making sure the servers are running properly and are up to date, but also deploying new things to make the developers’ everyday life easier. This year, we plan to deploy two more test bots: one on VMware ESX, another on VMware Player on Linux. Both will be handled through libvirt and our sysreg tool. I have already started implementing the support in our tool.

Another thing will be the (also long-awaited) switch to CMake. We will drop support for our own solution (rbuild) and use CMake to handle ReactOS builds. This has been postponed until the release is out and until the ReactOS Build Environments (RosBEs) are ready. Both are about to be completed. On RosBE-Unix, I took over from Colin, as he has less time to give to ReactOS. We also have one last build slave running rbuild builds. Once the switch to CMake is done, it will be time to say goodbye to it. It will be a real change for the project: the last rbuild builder has done more than 10k builds!

I also plan to deploy another static analysis tool on our server to check ReactOS code quality. We already have cppcheck (with quite a huge configuration, by the way…), which returns pretty good results. We also use Coverity with success. The idea would be to add another, rather different tool: sixgill. This tool appears to be used successfully by the Mozilla Foundation.

The Foundation also has a Windows Server 2003 license that we would like to use to set up a test environment for developers. That way, they could write tests for functions, run them on Windows 2003, and implement or fix things the right way in ReactOS. This method is also known as test-driven development. Rather useful for us!

Finally, if I still have a bit of time (looks difficult!), I would like to finish all the things I started or planned last year on ReactOS. But I guess it will not be that easy…

Introducing the PierreFS file system

November 27th, 2011

It’s been a long time since the last blog update. One of the reasons for such an absence is that I was really busy at CERN. During my internship there (at the LHCb experiment), my mission was to improve the way their diskless farm works.

It previously relied on the Red Hat diskless tools to allow ~15 nodes to run without a disk, getting their OS & their data from another server. But this system, and its implementation by Red Hat, is really limited. Furthermore, starting with RHEL6 (Red Hat Enterprise Linux 6), support for those tools has been dropped. LHCb had to switch to another method.

One of the solutions that was immediately looked at is union file systems. A union file system merges several directories (called branches) into one, in a transparent way. Some of those directories can be read-only; changes are then only applied to the writeable directories. So, for LHCb, the underlying idea was to use a union file system to provide easily manageable nodes. One shared root file system would be managed, and snapshots (one per node) would be merged with the shared root into different directories (one per node). Each node would then boot from its own directory. That way, each node easily gets its own personality and its own file system, without the constraints of the Red Hat tools.
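
To make the idea of branches more concrete, here is a minimal sketch of how a lookup could work in a union of branches: they are tried in priority order (the writeable one first), and the first branch that has the file wins. This is not code from any real driver. The names `union_lookup`, `exists_fn`, `demo_files` and `demo_exists` are hypothetical, and the tiny in-memory table only stands in for a real file system query such as stat().

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical predicate: returns non-zero when `path` exists.
 * A real implementation would ask the underlying file system. */
typedef int (*exists_fn)(const char *path);

/* Illustrative stand-in for a file system: a fixed list of paths. */
static const char *const demo_files[] = {
    "/rw/etc/hostname",   /* per-node copy in the writeable branch */
    "/ro/etc/hostname",   /* shared copy in the read-only branch */
    "/ro/bin/sh"          /* only present in the read-only branch */
};

static int demo_exists(const char *path)
{
    for (size_t i = 0; i < sizeof(demo_files) / sizeof(demo_files[0]); i++)
        if (strcmp(demo_files[i], path) == 0)
            return 1;
    return 0;
}

/* Resolve `name` against the branches, in priority order.
 * On success, the full path is left in `out` and the index of the
 * winning branch is returned. Returns -1 if no branch has the file
 * or if `out` is too small. */
int union_lookup(const char *const *branches, size_t count,
                 const char *name, exists_fn exists,
                 char *out, size_t outlen)
{
    for (size_t i = 0; i < count; i++) {
        int n = snprintf(out, outlen, "%s/%s", branches[i], name);
        if (n < 0 || (size_t)n >= outlen)
            return -1;
        if (exists(out))
            return (int)i;
    }
    return -1;
}
```

With the branches given as { "/rw", "/ro" }, looking up etc/hostname resolves to the writeable copy, while bin/sh falls through to the read-only branch.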

We evaluated several union file system drivers to select the one that fitted our needs best. Unfortunately, while several of them were not that easy to deploy, they were even worse to use. They were designed for totally different uses (LiveCDs, for example).

This is why, during a weekend, I designed a new union file system that would match their needs. As I was more creative with concepts than with names, I temporarily called it “PierreFS”. Feel free to find another one. While this file system takes most of its concepts from the legacy UnionFS file system, it adds features designed to greatly reduce useless or redundant copyups (ie, files copied from the read-only branch to the read-write one so that they can be edited). One of the major features of the PierreFS file system is its ability to split data & metadata when performing a copyup. It can handle both separately, which prevents data redundancy and allows easy updates of the data even when the metadata have been modified. It also has a feature that deletes copyups when it detects that they are useless. Finally, this file system comes with some limitations, since it targets specific configurations: it can only merge one read-only branch & one writeable branch.

All the features described above (including those from UnionFS) were implemented as a FUSE file system, to test the viability of the file system and to validate the concepts. Unfortunately, this doesn’t include all the features I had in mind. Some more, like caching, were considered at first (and have not been dropped yet). The driver successfully passed our tests and met LHCb’s expectations.

I’ve started porting it to the Linux kernel as a driver for CERN. I hope I’ll be able to release it quickly. This work is once again based on the UnionFS team’s work, which is a good basis to learn and understand.

You can find the FUSE driver at sourceforge.net (no packaging has been done - won’t be done).

This file system was presented at the ICALEPCS11 conference in Grenoble, and is described in the WEMMU05 paper & poster (with a comparison with the other file systems). All the conference material will be available on the conference site.

ReactOS & GSoC (and all the rest…)

March 22nd, 2011

The 18th of March was quite a great day for the ReactOS project. Indeed, on that day, Google released the list of accepted mentoring organisations for the Summer of Code. And for 2011, ReactOS is in. The last time the ReactOS project was accepted was in 2006. So, this is something quite magical for the project.

But, you may wonder: what is the Google Summer of Code? The idea is simple. Each year, Google selects organisations (ie, Free Open Source Software projects) based on the applications they send. Then, those organisations, called mentoring organisations, are able to ask for “slots”. Each slot is a student that the organisation will mentor on a project, an idea. The important part is that, to ensure students will come (and organisations will apply), Google offers $5500 per fully completed project ($5000 goes to the student, and $500 to the organisation). That way, students who don’t have an internship can earn money during the summer, and FOSS organisations can earn money as well.

This year, ReactOS will get 5 slots, which means that we will welcome five students. You can find some ideas of projects to work on here. But students are free to propose another subject when sending an application. Students can’t apply yet and have to wait until the 28th of March (before that, they have to learn a bit about the projects).

The mentors you may expect to have on ReactOS aren’t known yet. But you’re likely to find Aleksey Bragin in the list (it would be weird if he wasn’t in!). Actually, the list will only be known once students and projects have been chosen. It is done that way to ensure that mentors match the subjects as well as possible (and to prevent any issue with the mentor). Anyway, several ReactOS developers have already applied to be mentors. So no shortage of mentors in sight!

Another topic regarding ReactOS: I presented ReactOS at ISIMA on the 17th of February. Part of the presentation was filmed and you can find the uploaded video here. I may do another ReactOS presentation in France this year, but nothing has been confirmed yet. So, follow the ReactOS news for the moment!

I hope you’ll enjoy the video and the information given in it. If you want to get the PDF used for the ReactOS presentation, you can find it here.

That’s all for the moment (nothing related to development yet). More will follow (about dev & GSoC). And don’t hesitate to join the ReactOS project. GSoC is a great opportunity!

Let’s go back to 2010

January 7th, 2011

Indeed, now that 2010 is over, it may be interesting to go back to it. Not for the same reasons that led ReactOS to go back to 2010 (cf: r50529, a revert needed due to a bug in boot image handling). In fact, it is interesting to go back to 2010 just to see what has been done, what was planned, what has failed. In the end, what we can say and remember about that year.

First of all, I would like to talk about my servers. As you may know (or not), I currently own two servers, known as www.heisspiter.net and www2.heisspiter.net, and I also rent a third server at OVH, called www3.heisspiter.net. I was not talking about them, because I was not really taking care of them. And this led to many serious issues (bad performance, bots, and so on). During a weekend, I decided to switch www3 to IPv6, which finally worked. But it made me understand that my servers needed some love. So, now, I am back working on them. The idea that they had reached some point of maturity and could work alone is definitely wrong. Indeed, 2009 was already a pretty bad year for heisspiter.net, and 2010 was terrible. No evolution, several issues, servers down, … A quick look at the statistics shows that people also stopped coming to the server. And I cannot blame them. 2010 was a really bad year for heisspiter.net, and 2011 cannot be worse. I will just do my best not to make it the same.
One encouraging note regarding heisspiter.net, at least: some evolutions have started. I was talking about IPv6, but a webmail for users has also arrived, the upload service is back, the mail server has been fixed, the server software has been updated, and the heisspiter.net internal tools have been fixed. This actually explains why heisspiter.net has been more stable since December!

Other major point… Of course, ReactOS! You may have seen, reading my other posts, that the project took important steps towards stability and features. Some rewrites, some parts becoming more mature. The MM rewrite, together with the Heap rewrite (for user-mode land), forces developers to fix the ReactOS code so that it corrupts less memory (or corrupts memory less?). And this works. Especially when fixes are applied to those rewrites. This also comes after a hard year for ReactOS, with no releases, and nothing to release, due to a broken trunk. But that is in the past now!
My modest goal for 2011 is to prove that ReactOS has now gained some maturity. Some testers are already pushing to get 0.4, and I would like to show we are not that far from it. And I will try to show it in my area of work on ReactOS, ie filesystems and kernel. With the help of Johannes Anderwald and Art Yerkes, I am attempting to make ReactOS boot with the Microsoft FastFAT driver. Johannes has brought some code for tunnel handling in FsRtl, and Art some code for MCBs and CC. Finally, I come with notifications (still) and the motivation to make all that stuff work together (which it does not, at the moment). This is a very, very interesting experiment, since it kinda stresses ReactOS and forces me to work on a part of ReactOS I promised I would never work on (I am speaking of CC!). On the other hand, I will also keep working on other parts of the kernel as I did previously, trying to improve it and match Windows 2003. I will also switch a bit to FreeLDR, some bugs are calling me there!

About personal projects, I have been quite active during 2010, even if I did not publish anything about them. One of the projects I wanted to write about before I forget is a C++ garbage collector. I designed it for several uses, and in the end it is more a memory manager than a garbage collector. Its purpose is simple: giving you memory whenever you need it and keeping track of it. It can also perform some operations on it to make debugging your program easier, such as memory marking and memory zone tagging. It can also allocate non-paged memory, check against corruption, and so on. It has been designed to work in a multi-threaded environment and provides functions for that. For example, when you share memory zones between threads, sometimes you do not even recall who is using what, and for how long. Here the garbage collector becomes useful. Each thread, when it uses a zone, just needs to reference the memory zone. And once it is done, it dereferences it. A simple mechanism, but one that ensures the memory zone will be released once every thread is done with it.
I am not totally done with that project (that is perhaps why I have not published anything about it yet) and I plan to finish it and make it a bit closer to a garbage collector, by giving it the ability to allocate and release objects.
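
To illustrate the reference/dereference mechanism, here is a minimal sketch in C (the real project is in C++, and this is not its code). The names `zone_t`, `zone_alloc`, `zone_ref` and `zone_unref` are hypothetical, and the sketch assumes a C11 compiler providing <stdatomic.h> so that the counter can be shared safely between threads.

```c
#include <stdatomic.h>
#include <stdlib.h>

/* A reference-counted memory zone (hypothetical layout). */
typedef struct zone {
    atomic_uint refs;       /* number of current users */
    size_t size;            /* usable size of data[] */
    unsigned char data[];   /* the memory handed to the caller */
} zone_t;

/* Allocate a zone; the caller holds the first reference. */
zone_t *zone_alloc(size_t size)
{
    zone_t *z = malloc(sizeof(*z) + size);
    if (z == NULL)
        return NULL;
    atomic_init(&z->refs, 1);
    z->size = size;
    return z;
}

/* A thread that starts using the zone takes a reference. */
zone_t *zone_ref(zone_t *z)
{
    atomic_fetch_add(&z->refs, 1);
    return z;
}

/* A thread that is done drops its reference. The zone is released
 * when the last user is gone. Returns 1 if the zone was freed. */
int zone_unref(zone_t *z)
{
    if (atomic_fetch_sub(&z->refs, 1) == 1) {
        free(z);
        return 1;
    }
    return 0;
}
```

Nobody has to remember who is using what: whoever calls zone_unref() last releases the memory.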

The other project I have been working on (and am still actively working on) is a “new generation” IRCd written in C++. Its purpose is quite simple: implementing the five RFCs concerning IRC, optionally adding extra features that are often used or needed. But the new thing is that it comes with services built in (if built with them, of course!). This is quite new, and interesting in my opinion. When you need to rapidly deploy an IRCd, configuring both the IRCd and the services (once you have found good ones!) can be a pain. With that IRCd, everything is included. It also makes services really efficient, as they communicate directly with the server (in a proper way, nothing messy!). And there is no need for SVSMODE, SVSJOIN, etc, commands or equivalents: here you just use services. At the moment, the core IRCd is almost complete and works really well. Services are mostly non-existent (except OpServ, obviously).
For 2011, I plan to finish that IRCd, and perhaps to use it on heisspiter.net. Time will tell.

Finally, this is the shortened version, and there would be so much more to say… The best thing is to keep reading the ReactOS mailing-lists and this blog to stay informed!

Happy new year ;).

Coming this autumn in ReactOS

September 23rd, 2010

If you are following the ReactOS community, you may have spotted some information about my recent work there. Nothing linked to notifications this time. Let’s look back at how I got the idea to work on that part of ReactOS…

While looking at regressions, I found an interesting issue in bug #5145. In fact, it was not the issue itself, but the way it was appearing. It was producing a BSOD with bug check code 0x7B (INACCESSIBLE_BOOT_DEVICE), and I had the deep feeling that, with the bug described, it should have failed earlier. So, I looked into the code and spotted why it was not failing earlier. Indeed, our ARC names handling was pretty old and was not returning appropriate information (it was designed in an NT4.0 way, and did not return whether it had failed).

When booting, both ReactOS and Windows need a bootloader. On Windows, it’s called ntldr (up to Windows 2003; starting with Windows Vista, Microsoft switched to winload & bootmgr) and on ReactOS it’s called FreeLdr (or even WinLdr). This loader is in charge of loading the kernel (ntoskrnl, or ntkrnlmp on MP architectures) into memory in order to proceed with system boot. While starting the kernel, it also provides a LOADER_PARAMETER_BLOCK. This structure contains some important information about the system state, especially the partition from which it starts and on which the system is located, and its disks. But the loader provides those boot data using ARC names (example of an ARC name: multi(0)disk(0)rdisk(0)partition(2)). You may have already seen that Windows is not using such names (nor is ReactOS). It is using \Device\Harddisk0\Partition2. So, ARC names handling is responsible for linking ARC names and “Windows” names, but also for finding on which device (and more specifically, on which partition) the system is booting. If it cannot find it, the boot process is aborted, and the system is gracefully shut down by a bug check using code 0x69 (IO1_INITIALIZATION_FAILED). This means that IO Initialisation Phase 1 failed. That was the error I was expecting to show up, instead of that inaccessible boot device (which means that the boot device has been found…).
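
To show what linking both kinds of names looks like, here is a deliberately naive sketch that turns the ARC name of the example into the matching “Windows” name. The function name `arc_to_nt` is hypothetical, and the sketch assumes that the rdisk() number is the hard disk number, which the real kernel cannot take for granted: it has to identify the disk for real.

```c
#include <stddef.h>
#include <stdio.h>

/* Convert "multi(a)disk(b)rdisk(c)partition(d)" into
 * "\Device\Harddisk<c>\Partition<d>".
 * Naive assumption (for illustration only): the rdisk() index is
 * the hard disk number.
 * Returns 0 on success, -1 on a malformed name or a short buffer. */
int arc_to_nt(const char *arc, char *out, size_t outlen)
{
    unsigned adapter, disk, rdisk, partition;
    int consumed = -1;   /* only set if the whole pattern matched */
    int n;

    if (sscanf(arc, "multi(%u)disk(%u)rdisk(%u)partition(%u)%n",
               &adapter, &disk, &rdisk, &partition, &consumed) != 4)
        return -1;
    if (consumed < 0 || arc[consumed] != '\0')
        return -1;       /* missing ')' or trailing garbage */

    n = snprintf(out, outlen, "\\Device\\Harddisk%u\\Partition%u",
                 rdisk, partition);
    if (n < 0 || (size_t)n >= outlen)
        return -1;
    return 0;
}
```

Called with multi(0)disk(0)rdisk(0)partition(2), it produces \Device\Harddisk0\Partition2.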

That is why, when looking at the ReactOS ARC names handling code (populated with FIXMEs and hacks), I took an important decision: rewriting it properly, and in a Windows NT5 way. It was important, because the purpose was to switch our kernel from one major revision to a higher one: from NT4.0 to NT5.2 (Windows 2003). That version change was partly due to two new things: the Windows Mount Manager and GPT handling. Indeed, starting with Windows 2000 (NT5.0), Windows has been using a Mount Manager to mount volumes in a PnP way, and it already creates their device names (and references them in the registry). So, the new ARC names handling takes advantage of that Mount Manager to do less work. It also uses new kernel functions (the old ones having been extended) to manage partitions (as the partition table can be either in MBR format or in GPT format).

So, that rewrite turned into a real challenge. Especially since I wanted to do a serious implementation. I was implementing ALL the called functions, in order to provide a clean implementation, and also a working one, that would not come with a bunch of stubs. Luckily (or smartly, depending on the way you see it), the Microsoft guys left a legacy mode (coming from NT4.0) in ARC names handling, in case some drivers would not answer Mount Manager PnP requests. This meant I did not need to implement the whole Mount Manager in ReactOS (we read an empty registry table) and ReactOS can still boot using the legacy mode! First issue avoided. But the next issues could not be avoided. ARC names handling calls IoReadPartitionTableEx(), which is responsible for reading both MBR & GPT. So, I had to implement it… And to add some new features to ReactOS.

I guess you now understand what will come soon in ReactOS. Yes, the ReactOS kernel will now properly handle GPT. Do not misread me, I am only talking about the kernel. All the other parts of the OS that do not rely on the kernel are still unaware of what GPT is. I recently showed my results through a screenshot you can find here. On that screenshot, you can see ReactOS booting from CD to be installed, and you can see the new IoReadPartitionTableEx() in action, first displaying the four partitions it found in the MBR on the first disk, and then displaying the two partitions (out of 128) it found in the GPT, with their information. As you may have read, not everything is working yet, and it is also a bit hacked to produce such a result (the GPT partition table checksum computation just fails, as the error message shows). Once I was done with ARC names handling and the needed GPT functions, as trunk was still locked, I decided to finish implementing all the missing FSTUB API. In fact, what was missing was all the functions dealing with GPT. So finally, that “small” patch that aimed to rewrite a part of the kernel to fix some bugs and provide a proper implementation ended up as a full implementation of a package in the kernel.
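
About that checksum: as far as I understand the UEFI specification, GPT protects both its header and its partition entry array with a standard CRC-32 (the reflected 0xEDB88320 polynomial, as used by zlib and Ethernet), the header one being computed with its own checksum field set to zero. Here is a minimal bitwise sketch of such a CRC-32. The function name `crc32_ieee` is hypothetical, and this is not the ReactOS code.

```c
#include <stddef.h>
#include <stdint.h>

/* Bitwise CRC-32 (reflected polynomial 0xEDB88320).
 * Slow but short; real code would usually use a lookup table. */
uint32_t crc32_ieee(const void *buf, size_t len)
{
    const unsigned char *p = buf;
    uint32_t crc = 0xFFFFFFFFu;   /* initial value */

    while (len--) {
        crc ^= *p++;
        for (int bit = 0; bit < 8; bit++) {
            /* shift right, applying the polynomial when the low bit is set */
            uint32_t mask = 0u - (crc & 1u);
            crc = (crc >> 1) ^ (0xEDB88320u & mask);
        }
    }
    return ~crc;                  /* final inversion */
}
```

The usual check value applies: the CRC-32 of the nine ASCII characters 123456789 is 0xCBF43926.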

But not everything is that great. First of all, an issue arose early on while implementing GPT in the ReactOS kernel, which made me mail the ros-dev mailing-list. Is there room for improvement? I do think so, especially since I am nearly done with it. Microsoft is not really matching the EFI standard (and not only on the point I explained in my mail) and ReactOS could implement it properly. It would be a nice advert for the project. Furthermore, to manage GPT properly, ReactOS would need a better storage stack. Most of our current storage stack comes from the NT4 DDK (as the license permitted it). But if the kernel drops NT4 compatibility, it needs a bit more from the storage stack. I quickly implemented the basic needs, but I could not implement the deeper ones. Those are just hacked away in the storage stack at the moment. And since no one in the project really understands how the storage stack works, it will be a blocking issue. Finally, I still need to fix some issues in the code I implemented to make it work properly. This may take some more time…

Anyway, in spite of those issues, I really hope I will be able to commit a first patch with basic features working in ReactOS this autumn. The only advice I can give you is to follow ReactOS’ life, so as not to miss it. Talking of which, it looks like a release is coming for ReactOS ;)

ReactOS’ status

August 19th, 2010

It’s been a long time since I last wrote about ReactOS. This was mainly due to me being away from the project. Or, at least, passive. I was working on notifications, used by FSDs (File System Drivers). I finally ended my period of absence by writing documentation about notifications, containing all the information I gathered during my research about them. Then, I actively started working on them, coding them into ReactOS.

First of all, let’s quickly sum up what notifications are. To make it short, the purpose of notifications is to notify someone when a change occurs. In an OS, it means you can be notified when a volume is mounted, when a file is added to a directory, when a directory is deleted, and so on. There are two ways to deal with those notifications in Windows/ReactOS. State notifications (mount, unmount, lock, unlock) are handled by the PnP manager. On-disk data changes are handled by both the FSD and FsRtl, through a package of functions. So, how to use them? A client application (mostly in user-mode) registers a notification and waits for it to complete. It’s that easy. Internally, things get harder, but both systems work in an equivalent way. For state notifications, the PnP manager maintains a list of registered notifications. Each time a state change occurs, the driver (or even the kernel) responsible for that change has to call a PnP function, which is IoReportTargetDeviceChange() or its asynchronous counterpart, IoReportTargetDeviceChangeAsynchronous(). For an FSD, there’s an even easier way to notify, which is calling FsRtlNotifyVolumeEvent() with the event that occurred. FsRtl will do all the needed stuff and call the PnP manager. Once the PnP manager is called with a reported change, it just browses the notifications list, finds those that match the report, and completes them. The caller is then informed about the change. What about on-disk changes now? It works the same, just replace PnP with FsRtl. FsRtl maintains the list on the FSD’s demand, browses notifications, takes reports and so on. To register a notification, the FSD just calls FsRtlNotifyFullChangeDirectory(), and to report a change, FsRtlNotifyFullReportChange(). It doesn’t have to do more.
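
The register / report / complete cycle described above can be sketched in a few lines of plain C. This is only a toy model of the mechanism, not the PnP or FsRtl code: all the names (`notify_list`, `notify_register`, `notify_report`, `count_cb`) are hypothetical, events are plain integers, and there is no locking.

```c
#include <stddef.h>

#define MAX_NOTIFY 16

/* Called when a registered notification is completed. */
typedef void (*notify_cb)(int event, void *context);

typedef struct {
    int in_use;
    int event;        /* the event the client is waiting for */
    notify_cb cb;
    void *context;
} notify_entry;

/* The list of pending notifications kept by the manager. */
typedef struct {
    notify_entry entries[MAX_NOTIFY];
} notify_list;

/* Sample callback: counts completions in an int given as context. */
static void count_cb(int event, void *context)
{
    (void)event;
    ++*(int *)context;
}

/* Client side: register a notification. Returns 0, or -1 if full. */
int notify_register(notify_list *l, int event, notify_cb cb, void *context)
{
    for (size_t i = 0; i < MAX_NOTIFY; i++) {
        if (!l->entries[i].in_use) {
            l->entries[i] = (notify_entry){ 1, event, cb, context };
            return 0;
        }
    }
    return -1;
}

/* Driver side: report a change. The list is browsed, and every
 * matching notification is completed (and removed, as it is one-shot).
 * Returns the number of completed notifications. */
int notify_report(notify_list *l, int event)
{
    int completed = 0;
    for (size_t i = 0; i < MAX_NOTIFY; i++) {
        notify_entry *e = &l->entries[i];
        if (e->in_use && e->event == event) {
            e->in_use = 0;
            e->cb(event, e->context);
            completed++;
        }
    }
    return completed;
}
```

A client that wants to keep watching simply registers again once its notification has completed.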

So, now, what’s present in ReactOS, and what’s not?

That question is hard to answer. But, let’s try to make the answer as clear as possible.

  • About state change notifications: technically, everything has been present in the ReactOS kernel since revision r47837. It isn’t implemented the way it is in Windows, but that’s already a good start for having such notifications. I also added some reports. For example, the FAT driver reports when it successfully mounts a volume. But, in fact, nothing works. For the simple reason that the IoReport* functions need a PDO (Physical Device Object) to perform the report. Now, Windows (and ReactOS as well) calls the drivers to get that PDO. In fact, you have a stack of drivers. The highest level is the FSD, and the lowest level is the one that communicates with the disk. So, to get the PDO, Windows calls the highest-level driver to ask for relations, and for the PDO. Then, each driver passes the request to the next lower driver it knows about in the stack, until the last one, which completes the request, giving the PDO. It goes down the stack. This side of notifications isn’t implemented, as it’s done using PnP requests, and our drivers don’t handle them. I recently added that support to our FAT driver (r48560), but as the rest of the drivers don’t handle it, it fixes nothing. Work will have to be done!
  • About on-disk change notifications, that’s the exact opposite. They work, but aren’t implemented. That could look a bit weird, but it’s not. In fact, those notifications are working in my working copy, but I haven’t released them yet, to have time to fix them. You may have already seen the screenshots I published about them: http://pierre.heisspiter.net/rostests/notifications.png & http://pierre.heisspiter.net/rostests/notifications3.png. The first one was the earliest test I did that worked. Full of bugs, of leaks, but it worked. In fact, it’s showing a Microsoft application designed to test notifications (why use something else? :P) that you can find here. The way it works is easy. You start it giving a directory, and it will register a notification on that directory for file changes. And a notification on the root directory, for directory changes. So, on the first screenshot, you see the application monitoring the C:\ drive, and me saving a new file called newfile.txt to C:\. The application successfully got the report. But given the code at that time, I was really lucky it worked! The second screenshot comes later, and shows both notifications working. Saving two files, creating a directory, and the applications (the same app started twice with two different directories) printing about both. There, the code was getting cleaner and cleaner. Now, the code is in a consolidation state. Which means I’m trying to make it rock solid, and I’m also trying to understand details I still don’t get. I hope I may commit that code soon. At first, notifications won’t be as complete as in Windows, but it’s a good step towards having them. The only issue is that nothing is using them on ReactOS. No application (not even explorer) is registering notifications, and no driver is reporting changes (ie FastFAT/CDFS). So, something will have to be done there as well.
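
To picture the PDO lookup described in the first point, here is a toy model of a request going down a driver stack: each level passes it to the next lower one it knows about, and the lowest one completes it. The `device_t` structure and the `query_pdo` function are hypothetical and only illustrate the principle, not the real PnP requests.

```c
#include <stddef.h>

/* One level of the driver stack (hypothetical layout). */
typedef struct device {
    const char *name;
    const struct device *lower;   /* next driver down, NULL at the bottom */
} device_t;

/* Ask `dev` for the PDO: the request goes down the stack until the
 * lowest level, which completes it by returning itself. */
const device_t *query_pdo(const device_t *dev)
{
    if (dev == NULL)
        return NULL;
    if (dev->lower == NULL)
        return dev;                 /* bottom of the stack: complete */
    return query_pdo(dev->lower);   /* otherwise, pass it down */
}
```

If any level in the middle does not pass the request down, the caller never gets the PDO, which is the situation described above when only one driver handles it.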

Now, what else? After having fought to get a trunk freeze, I switched to fixing ReactOS instead of continuing to work on notifications. As you might have noticed if you follow the project, the OS recently regressed to a state never reached before. Only a few applications are working, and getting ReactOS to boot is hard. It appears this is due to some memory corruption, deep in the OS. Knowing the origin of such corruption is quite hard. Indeed, MM (Memory Manager) has recently been rewritten, and made less permissive than the version we had previously. So, we’ve got three choices: either the new MM is broken, or the rewrite sheds some light on defective ReactOS components (even in the kernel), or… both! Personally, I tend to believe that the last one is ours. Even if that sucks. Such revisions show that MM is responsible for a part of it. Now, what? Is all the rest OK? I don’t think so, and neither does Aleksey Bragin, coordinator of the project and a developer as well. After a short talk, we both agree that our memory corrupters could be the FSDs. Indeed, those were designed and coded in the early ages of ReactOS, when the kernel was poor. And since then, they have been more hacked than improved to take advantage of the kernel’s new features. I would even go further, actually, and say that FSDs and kernel are both responsible for that status: the FSDs for calling the kernel with bad/broken parameters, and the kernel for lacking proper checks. I’m thinking of a particular part of the kernel here. I’m talking about CC (Cache Controller). This part is involved in caching data and delaying reads/writes to the disk. So, it’s a heavy MM client. Our current implementation is really poor, due to the complexity of the caching process. Checks are scarce. So, this plus a broken driver can do huge damage. I think that’s the situation we’re in. Time will tell. I’m currently trying to track any (really, many are… Only a bit!) broken call to CC by our FSDs that might have side effects.
And I’m also thinking about a hackish solution that may be found. For whatever reason, my working copy, with all the changes to FSDs and kernel it has (and even to user-mode components!), is far more stable than trunk. Why? I don’t know. But a hackish solution could be found in it, just to help releasing and to push fastfat_new to replace our current old fastfat driver. But I’ll talk about fastfat_new and all the plans we (Aleksey and I) have for it in another entry.

Just keep in touch with the project, we’ll keep impressing you, in spite of those… blockers! Once those are fixed, the next release will be great. Really.

How to check for a valid pointer?

June 6th, 2010

Recently, for one of the projects I’m working on, I’ve been asked a really naive question: “How can I be sure a function pointer is valid?”. While the question is naive, the answer isn’t that easy to find. In most programs, when pointers are checked, we just check whether the pointer is null. If not, we assume the pointer is valid. While that test is sufficient in most cases, sometimes it’s not enough.

So my goal here was to find a way to catch wrong non-null pointers. I immediately thought about huge and crappy solutions where pointer checks would have been slower than crashing and restarting.
Then, when you are close to giving up, you do stupid stuff, such as typing: man end. There was the solution. Three external variables are provided by ld when linking a program, and defined by the loader when the program is started: etext, edata, end.
Let’s switch back to the structure of a binary. When you build a program using GCC, and without playing with sections, your program is cut into 3 parts: .text, .data, .bss. The interesting part for us is .text. This is where the program code is stored. So, when you’re using a function pointer, its address has to point into the .text section, or it’s not valid. etext, meaning end of text (section), is the first address past the end of the text section, so every function pointer address has to be lower than etext. This gives the first way to check a function pointer. Then, I wondered: the pointer has to be higher than something. What? And how to know it? The what was easy to find: the base address of the binary code in memory. Indeed, when you start a program, its code is stored in memory. So, I had to find the address of the first instruction. How? While browsing the web, I found that another symbol was also provided at link time: _start. This is the address of the entry point of the program in memory (the startup code that ends up calling the main() function of the program). And most of the time, when the OS loads a program, the entry point ends up at the beginning of the text section in memory.
So, I wrote the following function:
int is_fct_ptr_valid(void *p)
{
  extern char _etext, _start;
  return (((char*) p < &_etext) && ((char*) p > &_start));
}
This way, you can check in a more accurate way whether a function pointer is valid.

Then, another question arose in my brain: “OK, you can check function pointers in a nice way, what about memory now?”. Memory is something harder to check, and I was again a bit lost about how to proceed. I found a first way: most of the time, the memory of a program is stored at the end of the program representation in memory. So, a pointer address has to be higher than end (cf: previous paragraph). And, in fact, getting the highest address is easy. You just have to use sbrk(0). sbrk is the function you can use to increase the heap size of your program; it takes the size of the increment as parameter. 0 means no increment, so it just returns the current program break, ie the current highest heap address.
So, I implemented that and tested. But it failed. I was really stupid thinking that would always work. In fact, in a program, you’ve got two kinds of memory: heap and stack. The method described above is only good at checking the heap. Now, the question was: how to check the stack then? There was no direct way to check it, as there was for all the rest. Then, I thought about something a bit tricky. When you call a function, memory for local variables is allocated from the stack. In fact, you just subtract the size you need from SP (Stack Pointer) and then use the stack past that given SP (assuming the stack grows downwards, as it does on most common architectures). And you do it each time you call a function, even inside another function. And main() is the main function. Then, if you know the address of the stack in main, you know that every other stack pointer has to be lower than it. And when you call the function to check memory, you know that’s the last called function, so the pointer to check has to be higher than the address of a local variable of that function. The tricky method was born.
char * sstack;
int is_mem_ptr_valid(void *p)
{
  char estack = 0;
  extern char _end;
  return (((((char*) p > &_end) && (p < sbrk(0))) || (((char*) p < sstack) && ((char*) p > &estack))));
}
int main()
{
  char start_stack;
  sstack = &start_stack;
  /* … */
}
Here, you have the whole process to check memory.

Now, the test program, to show you the whole process:

/* Pointers checkings example */
/* Author: Pierre Schweitzer */

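#define _DEFAULT_SOURCE 1 /* needed for sbrk() on glibc >= 2.20 */
#define _BSD_SOURCE 1     /* same purpose on older glibc versions */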
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
#include <unistd.h>

typedef struct _call_t
{
  void (*function)(void *);
  void * data;
} call_t;

char * sstack;

int is_fct_ptr_valid(void *p)
{
  extern char _etext, _start;
  return (((char*) p < &_etext) && ((char*) p > &_start));
}

int is_mem_ptr_valid(void *p)
{
  char estack = 0;
  extern char _end;
  return (((((char*) p > &_end) && (p < sbrk(0))) || (((char*) p < sstack) && ((char*) p > &estack))));
}

void stupid(void * data)
{
  printf("function's been called :)\n");
}

void callfunction(call_t * ct)
{
  if (is_mem_ptr_valid(ct))
  {
    if (is_fct_ptr_valid(ct->function))
    {
      (ct->function)(ct->data);
    }
    else
    {
      printf("Incorrect function pointer: %p\n", ct->function);
    }
  }
  else
  {
    printf("Incorrect memory pointer: %p\n", (void *)ct);
  }
}

int main()
{
  char start_stack;
  call_t call1, call2, * call3;

  sstack = &start_stack;

  call1.function = stupid;
  call1.data = NULL;
  call2.function = (void(*)(void *))time(NULL);
  call2.data = NULL;

  call3 = malloc(sizeof(call_t));
  call3->function = stupid;
  call3->data = NULL;

  printf("Test #1: function will be called\n");
  callfunction(&call1);
  printf("Test #2: function error will be raised\n");
  callfunction(&call2);
  printf("Test #3: memory error will be raised\n");
  callfunction((call_t *)time(NULL));
  printf("Test #4: function will be called\n");
  callfunction(call3);

  return 0;
}

The functions given above, even if they are more accurate, are not fail-proof. Several assumptions have been made because they are true in most cases. But if your memory isn’t contiguous, or if the entry point isn’t at the beginning of the code in memory, those functions become meaningless. Furthermore, they only check whether a pointer points into a valid memory zone, not whether the content is valid. Your program can still have issues due to off-by-one mistakes and the like.
But, that’s a nice way to begin! :)
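
To illustrate the “non-contiguous memory” caveat: memory obtained from mmap() (which many allocators use for large blocks) is perfectly valid, yet on a typical Linux layout it lies outside the range between _end and sbrk(0), so a range check of this kind rejects it. Here is a small sketch, assuming Linux with glibc; in_brk_heap and map_one_page are hypothetical names used only for this demonstration.

```c
#define _DEFAULT_SOURCE 1   /* exposes sbrk() and MAP_ANONYMOUS in glibc */
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

extern char _end;   /* linker-provided end of the program image */

/* The heap range test from the article, isolated for the demonstration. */
int in_brk_heap(const void *p)
{
  const char *c = (const char *) p;
  return (c > &_end) && (c < (const char *) sbrk(0));
}

/* Hypothetical helper: maps one anonymous, writable region of 4096 bytes,
 * or returns NULL when the mapping fails. */
void *map_one_page(void)
{
  void *m = mmap(NULL, 4096, PROT_READ | PROT_WRITE,
                 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
  return (m == MAP_FAILED) ? NULL : m;
}
```

On such a system, in_brk_heap(map_one_page()) returns 0 even though the mapped region can be read and written without any problem.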

How to check for a valid PDO?

September 27th, 2008

Let’s first define what a PDO is. PDO is an acronym for Physical Device Object. A PDO is a kernel object that describes a real physical device (as opposed to a logical or virtual device). It’s represented in kernel space by a structure called DEVICE_OBJECT (the name is generic because it applies to all device objects). This structure is shared between the kernel and drivers. Here is the structure:
typedef struct _DEVICE_OBJECT {
  CSHORT Type;
  USHORT Size;
  LONG ReferenceCount;
  PDRIVER_OBJECT DriverObject;
  struct _DEVICE_OBJECT* NextDevice;
  struct _DEVICE_OBJECT* AttachedDevice;
  PIRP CurrentIrp;
  PIO_TIMER Timer;
  ULONG Flags;
  ULONG Characteristics;
  volatile PVPB Vpb;
  PVOID DeviceExtension;
  DEVICE_TYPE DeviceType;
  CCHAR StackSize;
  union {
    LIST_ENTRY ListEntry;
    WAIT_CONTEXT_BLOCK Wcb;
  } Queue;
  ULONG AlignmentRequirement;
  KDEVICE_QUEUE DeviceQueue;
  KDPC Dpc;
  ULONG ActiveThreadCount;
  PSECURITY_DESCRIPTOR SecurityDescriptor;
  KEVENT DeviceLock;
  USHORT SectorSize;
  USHORT Spare1;
  PDEVOBJ_EXTENSION DeviceObjectExtension;
  PVOID Reserved;
} DEVICE_OBJECT, *PDEVICE_OBJECT;

But before using it, the kernel must check whether it received a good PDO. Why the kernel? Because, most of the time, a PDO is allocated by a driver using IoCreateDevice, which fails in case of error, so the driver can stop properly. If it doesn’t fail, the driver keeps the pointer in memory and doesn’t need to check it further. But the kernel receives such pointers from drivers and must ensure they are good (for example, by checking they are not random memory addresses). So, what does the kernel do to check those pointers? First, it checks for a non-null pointer: no need to continue if the pointer is null. Then, rather than validating the PDO pointer itself, it checks one of its members. As you can see above, DEVICE_OBJECT contains a pointer to a DEVOBJ_EXTENSION structure. Internally, another structure is used: EXTENDED_DEVOBJ_EXTENSION. It extends the “normal” structure with more information used by the kernel.
This structure looks like this:
typedef struct _EXTENDED_DEVOBJ_EXTENSION
{
  // …
  ULONG ExtensionFlags;
  PVOID DeviceNode;
  struct _DEVICE_OBJECT* AttachedTo;
  // …
}
EXTENDED_DEVOBJ_EXTENSION, *PEXTENDED_DEVOBJ_EXTENSION;
This structure provides an interesting opaque pointer to a DEVICE_NODE structure. This device node, which a driver should never change, is the one used by the kernel to check the validity of the PDO. So, it first checks whether the DeviceNode pointer is null. If it isn’t, it checks the Flags member of the device node to see whether the DNF_ENUMERATED flag is set (which ensures that the PDO has been correctly initialised). When both conditions are met, the PDO is considered valid.

In case of a failed test, what does the kernel do? In fact, there are two different ways to handle that: a gentle one, for non-critical cases, and a more drastic one for cases where the PDO must be valid. When an invalid PDO is received in a non-critical function, the function can just return with a STATUS_INVALID_DEVICE_REQUEST status and let the caller handle the case. In critical cases, the kernel must stop Windows execution to prevent any damage. There is only one solution: a call to the KeBugCheckEx function (producing a BSOD…). KeBugCheckEx is called with specific parameters to help debugging. The first argument, the bug check code, is set to 0xCA (PNP_DETECTED_FATAL_ERROR). It indicates that the PnP manager (part of the IO branch) encountered an error that it can’t recover from. Then comes the first BSOD parameter, 0x2, which indicates that the PnP manager received an invalid PDO. Then we pass the PDO address (to be able to get more information using WinDbg). Finally, MSDN mentions a third parameter, but experience shows that Windows (at least XP) doesn’t fill it.

Here is the way we could see that in C. First, we’ll define a helper macro to check the PDO:
#define IopIsValidPDO(PhysicalDeviceObject) \
  ((PhysicalDeviceObject) && \
   (((PEXTENDED_DEVOBJ_EXTENSION)(PhysicalDeviceObject)->DeviceObjectExtension)->DeviceNode) && \
   (((DEVICE_NODE *)((PEXTENDED_DEVOBJ_EXTENSION)(PhysicalDeviceObject)->DeviceObjectExtension)->DeviceNode)->Flags & DNF_ENUMERATED))

In this macro, we check the PDO as the Windows kernel does. Why the Iop prefix? Io because it’s part of the IO branch, and p because it’s private to the kernel. Now let’s see the two cases. The first one, with a return:
NTSTATUS IoSomeFunction(PDEVICE_OBJECT PhysicalDeviceObject)
{
  if (!IopIsValidPDO(PhysicalDeviceObject))
  {
    return STATUS_INVALID_DEVICE_REQUEST;
  }
  // …
}

And the second case:
VOID IoSomeFunction(PDEVICE_OBJECT PhysicalDeviceObject)
{
  if (!IopIsValidPDO(PhysicalDeviceObject))
  {
    KeBugCheckEx(PNP_DETECTED_FATAL_ERROR, 0x2, (ULONG_PTR)PhysicalDeviceObject, 0, 0);
  }
  // …
}
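
To experiment with this logic outside the kernel, here is a user-mode sketch built on mock types. Everything prefixed with MOCK_ or mock_ is made up for the illustration, including the flag value and the status codes; only the order of the checks follows the description above, with one extra defensive test on the extension pointer.

```c
#include <stddef.h>

/* Placeholder value: NOT the real kernel constant. */
#define MOCK_DNF_ENUMERATED 0x1u

/* Heavily simplified stand-ins for the kernel structures. */
typedef struct mock_device_node { unsigned long Flags; } mock_device_node;
typedef struct mock_devobj_extension { mock_device_node *DeviceNode; } mock_devobj_extension;
typedef struct mock_device_object { mock_devobj_extension *DeviceObjectExtension; } mock_device_object;

/* Made-up status codes standing in for NTSTATUS values. */
enum mock_status { MOCK_SUCCESS = 0, MOCK_INVALID_DEVICE_REQUEST = 1 };

/* Non-null PDO, non-null extension (extra defensive check),
 * non-null device node, and finally the "enumerated" flag. */
int mock_is_valid_pdo(const mock_device_object *pdo)
{
  return pdo != NULL
      && pdo->DeviceObjectExtension != NULL
      && pdo->DeviceObjectExtension->DeviceNode != NULL
      && (pdo->DeviceObjectExtension->DeviceNode->Flags & MOCK_DNF_ENUMERATED) != 0;
}

/* The gentle strategy: report the problem and let the caller deal with it. */
enum mock_status mock_io_some_function(const mock_device_object *pdo)
{
  if (!mock_is_valid_pdo(pdo))
    return MOCK_INVALID_DEVICE_REQUEST;
  /* ... real work would go here ... */
  return MOCK_SUCCESS;
}
```

Building a mock_device_object whose node has the flag set makes mock_io_some_function return MOCK_SUCCESS; clearing the flag or the DeviceNode pointer makes it return MOCK_INVALID_DEVICE_REQUEST.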

FsRtl on ReactOS

August 27th, 2008

FsRtl, short for FileSystem Runtime Library, is a part of ntoskrnl (the Windows and ReactOS kernel). It provides a set of advanced functions for FSDs, FileSystem Drivers, to interact easily with the system. For example, the functions most used by FSD writers must be the FsRtlNotify* functions. Called by the driver, they notify the kernel (and then the user) that a change occurred on a volume. It can be a change to files (added, deleted, modified, etc.) or a change of volume status (locked, dismounted, etc.). As you can see, those functions aren’t “essential”; an FSD can be written without them, it will just lack some features.

That’s why they weren’t a priority in ReactOS development, so most of them aren’t implemented and produce a BSOD, Blue Screen Of Death. Indeed, in the kernel, in order to preserve system integrity, a call to an unimplemented function produces a BSOD. For example, FsRtlNotifyInitializeSync is written that way in the kernel:

VOID
NTAPI
FsRtlNotifyInitializeSync(IN PNOTIFY_SYNC *NotifySync)
{
    KEBUGCHECK(0);
}

It’s the KEBUGCHECK macro that produces the BSOD (it takes its name from the KeBugCheck function). But why have those functions been ignored? First of all, because they are not vital, but also because of their lack of documentation. When a function is documented, along with everything it relies on, it’s easier to come up with an implementation just by reading MSDN. But FsRtl was really badly documented (and even if efforts have been made, it’s still not enough). Moreover, FsRtl functions often use parameters that Microsoft calls “opaque” pointers. Concretely, those are pointers to internal structures. The kernel deals with them, and FSDs just keep the pointers to pass them to later calls. So, we point to unknown elements. When writing an FSD that’s not a problem, and it can even be an advantage: the kernel does everything. But for an implementation it becomes a real problem, because those internal structures (such as NOTIFY_SYNC) aren’t documented. Some undocumented structures can be found with WinDbg, but the FsRtl ones can’t. Only one solution remains: reverse engineering. That way, we can find out how ntoskrnl works with those structures, and try to work out what each field is designed for.
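
The “opaque pointer” idea can be shown in plain C. In this sketch, the caller only ever sees a pointer to an incomplete type; the layout is known to the implementation alone. All names (demo_sync, demo_sync_create, and so on) are invented for the illustration and are not the real FsRtl structures, whose contents remain undocumented.

```c
#include <stdlib.h>

/* What a driver would see: an incomplete type and a few functions. */
typedef struct demo_sync demo_sync;          /* layout deliberately hidden */
demo_sync *demo_sync_create(void);
void demo_sync_acquire(demo_sync *sync);
int demo_sync_owner_count(const demo_sync *sync);
void demo_sync_destroy(demo_sync *sync);

/* What only the implementation side knows: the actual fields. */
struct demo_sync {
  int owner_count;
};

demo_sync *demo_sync_create(void)
{
  return calloc(1, sizeof(demo_sync));   /* zero-initialised, NULL on failure */
}

void demo_sync_acquire(demo_sync *sync)
{
  sync->owner_count++;
}

int demo_sync_owner_count(const demo_sync *sync)
{
  return sync->owner_count;
}

void demo_sync_destroy(demo_sync *sync)
{
  free(sync);
}
```

In a real project the struct definition would live in a private source file, so code holding the handle could not touch owner_count directly. Reimplementing such an API means rediscovering those hidden fields.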

For now, in my FsRtl work, no internal structure has been added. I have to get a bit better first ;) . Anyway, using the methods described above, I managed to implement many FsRtl functions in my branch, pierre-fsd, which brings our kernel a bit closer to the Windows one. Indeed, I’m pretty proud to say that all Dbcs and Name functions have been implemented and are working! But new problems are coming… The last function I implemented is FsRtlNotifyVolumeEvent. It calls (as the Windows one does) an Io (Input/Output) function from the kernel: IoReportTargetDeviceChangeAsynchronous. And this function isn’t implemented! The BSOD has just been moved…

So the adventure has to continue ;)

ReactOS 0.3.5

June 30th, 2008

It was first planned for April 2008, and it has finally been released on the 30th of June… Here is the long story of such a delayed release.

It was delayed in April because of serious regressions in the code that made it impossible to install or run Firefox 2 (for example). Then, during June, it was delayed again because of internal problems caused by the restrictions imposed on a developer who was coding in a rather bad way (he finally left the development team, and the project). Those problems eventually became public and caused a lot of trouble, because people got worried about the future of ReactOS. So it became an “emergency” to release 0.3.5 (I have to admit that, in the middle of June, I proposed that we skip 0.3.5 and wait for 0.3.6) so that people would stop worrying about ReactOS. So, developers hurried to fix bugs. I said developers and not “we” because I didn’t fix any of them: they were in parts of ReactOS where I’m completely unskilled :(. One of the major bugs (the one that made Firefox unusable) got fixed by an external developer who eventually joined the development team (his patches are really great!). The rest were fixed by our lead developer, Aleksey Bragin.

And here we are: ReactOS 0.3.5, a free release (ReactOS will stay FOSS, despite what some people say…). After the bugs of 0.3.4 (it was a really bad release, and we didn’t want to repeat that!), this will be a great version, and a more stable one. It adds some nice features, and will let people who don’t know English use it easily.

Now, some words about the work I’ll do for ReactOS 0.3.6. As you may have seen (if you’re a ReactOS addict!), I stopped developing on trunk. Colin Finck created a branch for all the stuff related to FileSystem Drivers (FSDs); this was at my request, I wasn’t forced into it like the former developer. Those drivers let ReactOS read data from various partition types. Today, ReactOS only supports FAT32, which is really limiting. So, using this branch, I’ll try to make it support Ext2 and NTFS partitions. I’ll also try to improve our FAT32 driver and our FSD support (some kernel APIs are missing). I’m not sure I’ll be able to deliver even some minor features for 0.3.6, but let’s hope! ;)

Have a nice day… on ReactOS!

Edit on the 4th of July: I would just like to add a piece of advice for people who may have planned to use ReactOS NTFS features. It appears (well, in fact, it’s certain) that I introduced a really bad bug in revision 34036 (along with two others that were fixed before 0.3.5). It hasn’t been properly fixed yet (not even in my branch, though perhaps in my working copy (WC)). For now it has only been hack-fixed (and after the 0.3.5 release!). What does it do? It makes ReactOS freeze at different stages. So, the best way to avoid it is not to use NTFS volumes on the testing machine. And if there’s no choice, don’t try to browse any NTFS volume (besides, even without that bug, it wouldn’t work :p).

I’m sorry for the inconvenience caused by this bug :(.