Theoretically secure whitelisting

Discussion in 'sandboxing & virtualization' started by Gullible Jones, Mar 21, 2014.

  1. Gullible Jones

    Gullible Jones Registered Member

    Joined:
    May 16, 2013
    Posts:
    1,461
    We know the ability to run arbitrary binaries can be easily abused on Windows, so one of the preferred methods of blocking malware is to whitelist executables. This works fantastically for common desktop stuff; not so much for relatively advanced malware like Stuxnet (or an acne-pocked teenager pointing a copy of Metasploit at you).

    The problem is partly that, in a way, this is still blacklisting. What you're blacklisting is a few system calls. But if someone can mess around in a program's memory space, they might be able to make it load and execute a payload without resorting to those system calls you block.

    How about going a step further and whitelisting by system calls, instead of by executable files?

    Linux actually has a way of doing this. :) It's called seccomp, and Chrome uses it. The HTML renderer, and also the Flash plugin in recent versions, are only allowed to invoke the predefined set of system calls that they need in order to function.

    IIRC the way seccomp works involves building the whitelist into the application binary, or something along those lines. Aside from Chrome, not many applications use it.
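
    For reference, a seccomp filter whitelist can be sketched roughly as below. This is a minimal sketch and not Chrome's actual policy: the allowed set (write and exit_group only), the choice to fail denied calls with EPERM, and the function names install_whitelist and denied_call_is_blocked are all assumptions made for the example. A real filter would also check the architecture field before trusting the system call number, which is left out here.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <stddef.h>
#include <unistd.h>
#include <linux/filter.h>
#include <linux/seccomp.h>
#include <sys/prctl.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <sys/wait.h>

/* Install a whitelist for the calling process: write(2) and exit_group(2)
 * are allowed, every other system call fails with EPERM.
 * Returns 0 on success, -1 on failure. */
static int install_whitelist(void)
{
    struct sock_filter filter[] = {
        /* Load the system call number from the seccomp_data record. */
        BPF_STMT(BPF_LD | BPF_W | BPF_ABS, offsetof(struct seccomp_data, nr)),
        /* If it is write, fall through to ALLOW, otherwise skip one instruction. */
        BPF_JUMP(BPF_JMP | BPF_JEQ | BPF_K, __NR_write, 0, 1),
        BPF_STMT(BPF_RET | BPF_K, SECCOMP_RET_ALLOW),
        /* Same test for exit_group. */
        BPF_JUMP(BPF_JMP | BPF_JEQ | BPF_K, __NR_exit_group, 0, 1),
        BPF_STMT(BPF_RET | BPF_K, SECCOMP_RET_ALLOW),
        /* Default: deny by making the call return EPERM. */
        BPF_STMT(BPF_RET | BPF_K, SECCOMP_RET_ERRNO | (EPERM & SECCOMP_RET_DATA)),
    };
    struct sock_fprog prog = {
        .len = (unsigned short)(sizeof(filter) / sizeof(filter[0])),
        .filter = filter,
    };

    /* Required so an unprivileged process may install a filter. */
    if (prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0) != 0)
        return -1;
    return prctl(PR_SET_SECCOMP, SECCOMP_MODE_FILTER, &prog);
}

/* Fork a child, lock it down, and try a call that is not on the list.
 * Returns 1 if the call was refused, 0 if it went through,
 * -1 if the filter could not be set up at all. */
int denied_call_is_blocked(void)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        if (install_whitelist() != 0)
            _exit(2);
        long r = syscall(SYS_getpid);   /* getpid is not whitelisted */
        _exit((r == -1 && errno == EPERM) ? 0 : 1);
    }
    int status;
    if (waitpid(pid, &status, 0) < 0 || !WIFEXITED(status))
        return -1;
    if (WEXITSTATUS(status) == 2)
        return -1;
    return WEXITSTATUS(status) == 0;
}
```

    The filter is installed by the process itself and cannot be removed afterwards, which is roughly what "building the whitelist into the application" amounts to.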

    What about instead building a system wide profile of what programs can invoke what system calls? Instead of filtering specific ones, have:
    - A driver that intercepts every kernel function exported to userspace.
    - A database of which binaries can make which system calls.[1]
    - Some kind of simple user interface that lets you set a learning time period before the system is locked down.
    - A means of resetting the database (backing it up in the process), so that the system can be reconfigured.

    So, like seccomp filtering, except designed for Windows and applied on a global basis.
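
    The learn-then-lock-down behaviour in the list above could look something like the following. This is only a sketch under assumptions: the names (profile_check, profile_lock, profile_reset), the flat numbering of calls, and the MAX_CALLS bound are all made up for illustration, and a real driver would obviously need persistence and locking on top of this.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define MAX_CALLS  2048u               /* hypothetical upper bound on call numbers */
#define MASK_WORDS (MAX_CALLS / 64u)

enum mode { MODE_LEARNING, MODE_ENFORCING };

struct profile {
    enum mode mode;
    uint64_t  allowed[MASK_WORDS];     /* one bit per system call */
};

/* Called for every intercepted system call; returns true if it may proceed.
 * While learning, the call is recorded and always allowed.
 * While enforcing, only previously recorded calls are allowed. */
bool profile_check(struct profile *p, unsigned call)
{
    if (call >= MAX_CALLS)
        return false;
    uint64_t bit = UINT64_C(1) << (call % 64u);
    if (p->mode == MODE_LEARNING) {
        p->allowed[call / 64u] |= bit;
        return true;
    }
    return (p->allowed[call / 64u] & bit) != 0;
}

/* End of the learning period: stop recording, start denying. */
void profile_lock(struct profile *p)
{
    p->mode = MODE_ENFORCING;
}

/* Reset the profile so the system can be reconfigured.
 * If backup is non-NULL, the old profile is copied there first. */
void profile_reset(struct profile *p, struct profile *backup)
{
    if (backup != NULL)
        *backup = *p;
    memset(p->allowed, 0, sizeof p->allowed);
    p->mode = MODE_LEARNING;
}
```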

    Is this workable? Theoretically possible? Completely half-baked?

    [1] Not sure what this ought to look like, I'll follow up with a post describing an idea for that.
     
  2. Gullible Jones

    Gullible Jones Registered Member

    Joined:
    May 16, 2013
    Posts:
    1,461
    So the database would have to store up to 1000+ entries for probably thousands of binaries... Yech.

    I was thinking maybe something like this would do:

    - The database is a binary tree of node structs.
    - Each node contains the usual left and right pointers; two 64 bit integer checksums created using different algorithms (in case of collisions); and a pointer to a struct containing the list of allowed calls.
    - Each "list" is an array of 32 64-bit integers. The integers act as bit masks. :) System calls are enumerated in groups of up to 64 more or less related calls. To check if a system call is allowed, the driver checks the requisite bit in the requisite integer. 1 means allowed, 0 means denied.
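
    A minimal sketch of that bit mask check, assuming calls are addressed as a (group, index) pair. The struct and function names are placeholders for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define GROUPS          32u   /* number of 64-bit integers in a list */
#define CALLS_PER_GROUP 64u   /* one bit per system call in a group */

struct call_list {
    uint64_t mask[GROUPS];
};

/* Mark one system call as allowed: set its bit to 1. */
void call_allow(struct call_list *list, unsigned group, unsigned index)
{
    assert(group < GROUPS && index < CALLS_PER_GROUP);
    list->mask[group] |= UINT64_C(1) << index;
}

/* Check the requisite bit in the requisite integer.
 * 1 means allowed, 0 means denied; out-of-range calls are denied. */
bool call_is_allowed(const struct call_list *list, unsigned group, unsigned index)
{
    if (group >= GROUPS || index >= CALLS_PER_GROUP)
        return false;
    return (list->mask[group] >> index) & 1u;
}
```

    The check itself is one shift and one AND, so the lookup cost is negligible; the open question is only how much space the lists take.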

    But wait, if we're talking about perhaps a million database entries at a little over 2 KB each, that's about 2 gigabytes. Okay, that is totally not going to work.
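
    As a rough cross-check, the sizes can be computed from the layout described above. This is a sketch with assumed struct and field names. Note that a 64-bit integer is 8 bytes, so a 32-element list comes to 256 bytes; on a typical 64-bit build the node adds about 40 bytes, which puts a million fully separate entries at roughly 300 MB. That is less than the estimate above, but still a lot to keep resident.

```c
#include <stddef.h>
#include <stdint.h>

/* List of allowed calls: 32 integers of 64 bits each. */
struct call_list {
    uint64_t mask[32];
};

/* Tree node: left/right pointers, two checksums, pointer to the list. */
struct node {
    struct node      *left;
    struct node      *right;
    uint64_t          checksum_a;
    uint64_t          checksum_b;
    struct call_list *calls;
};

/* Total bytes if every entry has its own node and its own call list. */
unsigned long long database_bytes(unsigned long long entries)
{
    return entries * (sizeof(struct node) + sizeof(struct call_list));
}
```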

    Edit: the more I think about this the more it becomes clear that one would run up against memory limitations. There is no way to individually enumerate all allowed or denied system calls for all binaries on a system without immense memory consumption.
     
  3. Gullible Jones

    Gullible Jones Registered Member

    Joined:
    May 16, 2013
    Posts:
    1,461
    Okay, try two. We can't do individual profiles, so let's do groups.

    - Again I'll start with the binary tree
    - Each node has the left and right pointers, the two checksums, and a pointer to a group
    - Each group is a 32 element array of 64 bit integers (again) with each integer (again) acting as a bit mask, and the system calls enumerated as before

    Upshot is if you have 20 groups and say 3000 database entries, you have... (224 B * 3000) + (64 B * 32 * 20) = about 0.7 MB. Much better!
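
    A sketch of the grouped layout, assuming the tree is ordered by the first checksum and then by the second. The names (find_group, sum_a, sum_b) are invented for the example; nothing here is tied to a real HIPS.

```c
#include <stddef.h>
#include <stdint.h>

/* A group is a shared set of allowed calls: 32 bit masks of 64 bits. */
struct group {
    uint64_t mask[32];
};

/* One node per known binary. Many nodes may point at the same group. */
struct node {
    struct node        *left;
    struct node        *right;
    uint64_t            sum_a;   /* checksum from the first algorithm */
    uint64_t            sum_b;   /* checksum from the second algorithm */
    const struct group *group;
};

/* Look up a binary by its two checksums.
 * Returns its group, or NULL if the binary is not in the database. */
const struct group *find_group(const struct node *root,
                               uint64_t sum_a, uint64_t sum_b)
{
    while (root != NULL) {
        if (sum_a < root->sum_a)
            root = root->left;
        else if (sum_a > root->sum_a)
            root = root->right;
        else if (sum_b < root->sum_b)
            root = root->left;
        else if (sum_b > root->sum_b)
            root = root->right;
        else
            return root->group;   /* both checksums match */
    }
    return NULL;
}
```

    Since the nodes only carry a pointer, adding more binaries costs one small node each, and the masks are paid for once per group.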

    The problem here is whether it's fine grained enough. The grouping would probably demand manual control rather than automatic profiling, and asking an administrator to configure 1000+ system calls for each group is madness. You could abstract things further and group the system calls together as single flags... Which is basically back to being a traditional HIPS.

    Okay, forget the OP. This idea sucks.

    Edit: mind, I am (still) a C newb.

    It would be interesting to have some input from people who actually develop HIPS/FW software, assuming such can be contributed without revealing proprietary info.
     
  4. Hungry Man

    Hungry Man Registered Member

    Joined:
    May 11, 2011
    Posts:
    9,148
    Hey, just on the first post, I can edit this later:

    That's where this stops being workable. Windows has thousands of undocumented system calls. And no secure method of intercepting them without starting to work with DAC (how Chrome does it). The problem isn't writing the program; it's dealing with Windows.

    Your whole hash table/node struct isn't the issue though - that part's not hard at all, there are a *ton* of tricks to handle that. First of all, if you're just doing userland hooking there's not a lot of memory overhead at all: you'd implement it as a single driver and just hook and point to that driver, which will *overwrite* your old code with a jmp to the new code - little overhead.
     
    Last edited: Mar 21, 2014
  5. Gullible Jones

    Gullible Jones Registered Member

    Joined:
    May 16, 2013
    Posts:
    1,461
    Thanks.

    What exactly do you mean "starting to work with DAC"? Windows has functional DAC in Vista and later.

    Re the hash table, I guess if the part dealing with it is in userspace it can be mmapped and not loaded wholly into RAM... But I clearly don't know anything about how databases are designed. :) What's a good way of having a searchable database of that size that doesn't eat lots of space?
     
  6. Hungry Man

    Hungry Man Registered Member

    Joined:
    May 11, 2011
    Posts:
    9,148
    Yes, it has DAC. But you'd have to start brokering every process, and it gets a lot more complicated because DAC isn't fine-grained on Windows or any other OS.

    Well, for one thing you wouldn't need to load the entire database at once, especially a tree.

    But all you'd have is a single driver, and then you could just load part of the tree or something per-process, and I doubt there's gonna be 1000 processes loading at once.

    edit: Alright, not in the right state to be posting a ton on this. Tomorrow I will elaborate.

    edit2: Here we go...

    Any type of hooking for system calls would have to be done in userland. That isn't secure - a little extra ROP and you've bypassed it entirely (most userland hooking libraries actually warn you about this). Windows doesn't provide many secure built-in hooks, certainly when compared to the 1000+ system calls.

    So what you'd have to do to get anything working is remove a program's rights entirely, and broker that responsibility to some other process. Then it doesn't matter if your hooks are secure or not.

    In terms of memory why do you need it all loaded into RAM?

    Let's say a process has a 'profile' of calls. Store that profile in RAM if the process is running. Then, when that process runs, have a JMP to the code you've injected into it that redirects the call to your driver. Your driver checks the call against the profile and that's it.

    You only need one driver and one list of system calls, and then a list of programs. The hard overwriting adds little memory overhead since it's offloading decision making to the driver. All programs would use the same driver.
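
    To make the "overwrite with a JMP" part concrete, here is a sketch that only computes the five bytes of an x86 near jump (opcode E9 followed by a 32-bit relative offset). The function name is made up, and the sketch deliberately stops short of patching anything: actually writing these bytes into a running process involves changing page protections and preserving the overwritten instructions, which is not shown.

```c
#include <stdint.h>

#define JMP_REL32_LEN 5   /* 1 opcode byte + 4 offset bytes */

/* Encode "jmp to" as it would appear at address "from".
 * The offset is relative to the end of the jump instruction.
 * Returns 0 on success, -1 if the target is out of rel32 range. */
int encode_jmp_rel32(uint8_t out[JMP_REL32_LEN], uint64_t from, uint64_t to)
{
    int64_t rel = (int64_t)(to - (from + JMP_REL32_LEN));
    if (rel < INT32_MIN || rel > INT32_MAX)
        return -1;

    uint32_t r = (uint32_t)(int32_t)rel;
    out[0] = 0xE9;                          /* JMP rel32 */
    out[1] = (uint8_t)(r & 0xFF);           /* offset, little-endian */
    out[2] = (uint8_t)((r >> 8) & 0xFF);
    out[3] = (uint8_t)((r >> 16) & 0xFF);
    out[4] = (uint8_t)((r >> 24) & 0xFF);
    return 0;
}
```

    The patch is five bytes per hooked function no matter how large the policy is, which is why the overhead sits in the driver rather than in each program.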

    Your Windows system has 1,000+ system calls but it doesn't take up 2GB of memory right? :p
     
    Last edited: Mar 22, 2014
  7. dw2108

    dw2108 Registered Member

    Joined:
    Jan 24, 2006
    Posts:
    480
    If you take a close look at your counting, then you shall see that the binary growth is going to be really factorial growth, much, much faster than exponential growth. See J C Shepherdson & H E Rose, Subrecursion Theory, Oxford University Press.

    Dave

    PS: It is a trick for these anticrapware software writers to make their best guesses eventually. Complexity pushes them off the intended, exquisite target.
     
  8. MrBrian

    MrBrian Registered Member

    Joined:
    Feb 24, 2008
    Posts:
    6,032
    Location:
    USA
    From Windows security sandbox framework:
    From the paper:
     
  9. Gullible Jones

    Gullible Jones Registered Member

    Joined:
    May 16, 2013
    Posts:
    1,461
    @MrBrian: Wow. That looks like basically a policy sandbox remix of PyDBG. I'd never even thought of using debug breakpoints to sandbox programs. Brilliant! And something similar could probably be done on Linux. Thanks for mentioning this!

    Related: in addition to INT 3 (which is a software breakpoint), x86 also has hardware breakpoints, set through the debug registers. I wonder if those would be usable for sandboxing too, and what advantages/disadvantages they might have?
     
  10. MrBrian

    MrBrian Registered Member

    Joined:
    Feb 24, 2008
    Posts:
    6,032
    Location:
    USA
    @Gullible Jones: You're welcome :).

    I was unable to find the open source code mentioned in post #8, but I spent maybe only a minute or two looking.
     
  11. Yuki2718

    Yuki2718 Registered Member

    Joined:
    Aug 15, 2014
    Posts:
    1,257
    Seems like a spark of genius! (exaggerated?) I would never have thought of using breakpoints to hook.
    I'm curious to know whether there are other debug-function-based sandboxes or HIPS, or whether any malware uses this method to hook functions.
    If I understand it correctly, you mean the affordable way (and many sandboxes actually do this) is to
    delegate all security decisions to a broker process, which decides whether a request from a monitored process should be allowed (including the allow-but-modify case) or denied. It only needs the smallest possible list of APIs that can be allowed depending on condition or context; all other requests will be denied, so there's no need to know all 1000+ APIs?
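
    A sketch of that kind of default-deny broker decision, under assumptions: the request format, the rule table and its two entries, and the name broker_decide are placeholders for illustration, not taken from any real sandbox.

```c
#include <stddef.h>
#include <string.h>

enum verdict {
    VERDICT_DENY,
    VERDICT_ALLOW,
    VERDICT_ALLOW_MODIFIED   /* allow, but the broker rewrites the request */
};

/* What a monitored process asks the broker to do on its behalf. */
struct request {
    const char *api;      /* name of the requested API */
    const char *target;   /* e.g. a path; unused by this simple policy */
};

struct rule {
    const char  *api;
    enum verdict verdict;
};

/* The short list of APIs the policy knows about (example entries only). */
static const struct rule rules[] = {
    { "ReadFile",   VERDICT_ALLOW },
    { "CreateFile", VERDICT_ALLOW_MODIFIED },
};

/* Anything not explicitly listed is denied, so the broker never needs
 * to enumerate the full set of system calls. */
enum verdict broker_decide(const struct request *req)
{
    for (size_t i = 0; i < sizeof rules / sizeof rules[0]; i++) {
        if (strcmp(req->api, rules[i].api) == 0)
            return rules[i].verdict;
    }
    return VERDICT_DENY;
}
```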

    Also, can I ask: in 64-bit Windows (Vista+), what methods can be used to hook Native APIs other than inline/prolog hooks? Can it be done in user mode?
    Even placing an inline hook seems non-trivial, as the first instruction in a Zw* function is 3 bytes (does the same go for Nt*? I don't even know what the difference between Zw and Nt is). Is there a way?
     
  12. Hungry Man

    Hungry Man Registered Member

    Joined:
    May 11, 2011
    Posts:
    9,148
    Debug hooking is not reliable, as far as I know.

    @Yuki2718
    Depends on what you want to hook. 90% of what AV and sandboxes do probably resides in some sort of filter driver or file system driver, and the rest is relegated to some other form of hooking. Your driver would get information, talk to a broker OR make the decision itself based on a policy.
     
  13. Yuki2718

    Yuki2718 Registered Member

    Joined:
    Aug 15, 2014
    Posts:
    1,257
    Thanks for the reply.
    But a filter driver is part of Windows' own mechanism, so does that mean there isn't much room for software to hook Native APIs on 64-bit, except the ways MS supplies or by writing your own driver?

    I also found that callbacks can be used to behave like a hook (not sure if it is really a hook; I can't understand all of the following link).
    https://stackoverflow.com/questions/20552300/hook-zwterminateprocess-in-x64-driver-without-ssdt
    To be honest, I don't know what a kernel driver can or can't do, apart from I/O interception by a filter driver.
    Can I place a hook on any Native API if I have a kernel driver, or is it still restricted (on 64-bit)?
    Sorry for many questions.
     
  14. Hungry Man

    Hungry Man Registered Member

    Joined:
    May 11, 2011
    Posts:
    9,148
    Yes, by design you are only able to do certain things in very MS approved ways. This makes sense, it's how APIs work in general - work through some sort of interface, that way you can maintain a valid state internally.

    Callbacks are just functions. So I can pass a function1 a function2, and then when that function1 executes it'll (depending on the function) call function2. Really nice, functional programming, and it lets you do many awesome things. Hooking, idk, I can see them used *together* but not as a replacement for one or the other necessarily. I think kernel callbacks are probably one way of doing it but that's not something I know much about.
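
    In C terms, the function1/function2 idea looks like the sketch below. The names for_each and add_to_sum are invented for the example.

```c
#include <stddef.h>

/* The callback type: called once per item, with a caller-supplied context. */
typedef void (*visit_fn)(int value, void *ctx);

/* "function1": takes another function and calls it for every element. */
void for_each(const int *items, size_t count, visit_fn fn, void *ctx)
{
    for (size_t i = 0; i < count; i++)
        fn(items[i], ctx);
}

/* "function2": a callback that accumulates a running total in ctx. */
void add_to_sum(int value, void *ctx)
{
    *(long *)ctx += value;
}
```

    Calling for_each(xs, n, add_to_sum, &sum) with a long sum set to zero leaves the total of the array in sum. A kernel callback follows the same pattern, except that the OS is the one holding and invoking the function pointer.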

    File system filter drivers are only going to work for intercepting file system related calls, as it sits on top of the file system driver itself.

    For further interception, you'd have to do something else. My suggestion would be to set up another driver, or process, remove a process's rights to do anything (like with untrusted IL), and then use userland hooking to redirect all calls to that driver/process. That is the design I came up with a long time ago, and I've never implemented it in a meaningful way.
     
  15. Gullible Jones

    Gullible Jones Registered Member

    Joined:
    May 16, 2013
    Posts:
    1,461
    I thought that since XP, kernel hooks used a set of no-op bytes at the start of each hookable system call? There might be better mechanisms since Vista though, not sure.

    @Hungry Man, couldn't the userspace hooks for redirection be overridden from userspace? Or do you mean using a restricted token to strip the sandboxed processes of all rights, and giving some rights back via the kernel driver?

    I do wonder if it would be possible to use AppContainer/picothread sandboxing on Windows 8 and later to contain Win32/Win64 apps, instead of just WinRT ones. OTOH I also wonder if it's worth dealing with Windows at all any more. Windows 8 seems to me to inherit the worst of both Linux and Windows - rapid release cycle, bugs everywhere, obscure interface...

    Oh, and callbacks and function pointers rock. C is a great language; I just wish C compilers were a lot less lenient.
     
  16. Yuki2718

    Yuki2718 Registered Member

    Joined:
    Aug 15, 2014
    Posts:
    1,257
    Just use lenient warning option always :D (kidding)
    Misread.:blink:
    Then, always use -W and -Wall if you use gcc.:thumb:
    (but maybe you mean vulnerable coding check?)
     
    Last edited: Nov 20, 2014
  17. Gullible Jones

    Gullible Jones Registered Member

    Joined:
    May 16, 2013
    Posts:
    1,461
    @Yuki2718 yeah, I mean vulnerable code checking. There are static analyzers for that, but C compilers should spot more IMO.

    (Actually LLVM/Clang seems to have reputation for that, hopefully it will change the game a bit.)

    Re options, even with -Wall -Wextra there are some warnings you won't see; mostly recently added ones that could indicate vulnerabilities. To make GCC actually warn on everything it ought to requires a whole paragraph of flags. IMO it and other compilers badly need a -Weverything-and-the-kitchen-sink option.
     
  18. Hungry Man

    Hungry Man Registered Member

    Joined:
    May 11, 2011
    Posts:
    9,148
    @Gullible Jones
    No clue, could be. That would certainly make things a lot nicer.

    The latter. Remove all rights, use userland or whatever facility you like to give rights back. Chrome's method.

    I personally just don't bother with Windows. One day, when I'm ready, I will tackle that. It's just stupid how they handle this. I had a nice long talk with a MS security manager about this, and they were in agreement. They make too many silly decisions, and at this point I've got better things to do than delve into the 12 dictionary-sized books it would take to understand Windows internals.

    If you like functional programming in C, try C++. C++11/14 have added a ton of excellent facilities for this. std::function, functors, lambdas, etc. C++17 will improve further, but right now you can do very cool things. And it's not nearly as lenient as C, because you're working with containers/wrappers that must maintain valid states as part of the standard.

    When people write C++ the way it's meant to be written (as in, not "C with objects") it's really a great language. And not horribly unsafe.

    As for flags:
    -Wall -Wextra -pedantic

    Those are what I compile with for warnings. Clang has multiple static analysis tools, and very clear and readable warnings. It destroys g++ in that regard.
     
  19. Gullible Jones

    Gullible Jones Registered Member

    Joined:
    May 16, 2013
    Posts:
    1,461
    @Hungry Man: -pedantic makes GNU extensions generate warnings (and still doesn't enable some newer warnings IIRC).

    Re functional programming, I'm not much of a programmer at all. My main experience with functional languages has been a few experiments with Haskell and Ocaml; which basically got nowhere, as those languages have impossible formal-math-like syntax and absurd amounts of features.

    Mind, I think C's syntax and compactness are the best things about it. The syntax is both terse and easy to pick up, if used correctly; and the lack of umpteen million features makes it easy to keep the whole thing in one's head.

    Also, the more formal/academic functional languages make it difficult to use iteration, or anything that requires state. I can very well understand the reasons for that, especially in writing heavily multithreaded applications; but using tail recursion for everything gets ridiculous.

    ... Also, data structures in Haskell seemed opaque and inflexible. Not sure about Ocaml.

    As for actually doing functional and OOP stuff in C, it's really fun but I'm really bad at it. :) Thus far I've used it to implement a new spellcasting system in a roguelike game, and that's basically it.
     
  20. Hungry Man

    Hungry Man Registered Member

    Joined:
    May 11, 2011
    Posts:
    9,148
    If you know of a way to enable further warnings, I'm open to hearing it. I at one point had a big paragraph of warnings, but I replaced them with just those 3 and afaik they covered all of them.

    Thankfully C++ does not really adhere to a paradigm, so you can do anything. But I'm very happy with the functional stuff.

    Still cool :p
     
  21. Gullible Jones

    Gullible Jones Registered Member

    Joined:
    May 16, 2013
    Posts:
    1,461
    I'll give C++11 a try when I have the time (during the upcoming vacation maybe, if the server farm doesn't implode or anything). Never really got confident with C++ in general, probably because it's a huge language with an even huger set of unofficial extensions.

    Also have to admit I dislike full OOP with objects for everything. And inheritance. Especially inheritance. I prefer C's barebones approach to OOP, using structs and function pointers.
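
    For anyone wondering what that looks like, here's a toy sketch in the spirit of a roguelike spell system. It is not the actual game code; the spell struct, the two cast functions and the numbers are made up for illustration.

```c
#include <stddef.h>

/* An "object": data plus a function pointer acting as its method. */
struct spell {
    const char *name;
    int         power;
    /* Returns the target's hit points after the spell is cast. */
    int (*cast)(const struct spell *self, int target_hp);
};

/* Damage spells subtract power, never going below zero. */
int cast_damage(const struct spell *self, int target_hp)
{
    int hp = target_hp - self->power;
    return hp < 0 ? 0 : hp;
}

/* Healing spells add power. */
int cast_heal(const struct spell *self, int target_hp)
{
    return target_hp + self->power;
}
```

    With struct spell fireball = { "fireball", 8, cast_damage }; a call to fireball.cast(&fireball, 10) returns 2. Behaviour varies per object by swapping the function pointer, with no inheritance involved.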

    (Speaking of which, have you taken a look at Google Go? It's supposed to be a sort of C++ Done Right, as interpreted by Rob Pike and company. Docker is written in it, among other things...)
     
  22. Hungry Man

    Hungry Man Registered Member

    Joined:
    May 11, 2011
    Posts:
    9,148
    Just treat it as a totally separate language from C, use cppreference.com, and you'll have fun.

    Inheritance is gross. Mostly, OOP is gross. But with OOP you can get cool things like singleton model OOP, which makes sense. Data oriented OOP works well too.

    C++ doesn't really force things on you. But if you want to write good C++ code, don't start writing C code in there.

    I feel like Go is a fine language to learn, but outside of education, I find C++ to be more useful. Things like concurrency were a big part of Go, and C++ certainly lacks in concurrency, but 11, 14, and 17 bring it well up to speed. 17 in particular will add some very interesting features, like threadpools (I have to write my own threadpool class until this is implemented) among many other things.

    I think Rust is a serious contender, but not for another 5 years.
     
  23. Gullible Jones

    Gullible Jones Registered Member

    Joined:
    May 16, 2013
    Posts:
    1,461
    @Hungry Man you're talking about the singleton pattern in OOP? That looks kind of like the OO equivalent of a lock, I guess?

    I had to look up what a threadpool is. Curious what you're using that for (if it's not proprietary work-related stuff anyway).
     
  24. Rasheed187

    Rasheed187 Registered Member

    Joined:
    Jul 10, 2004
    Posts:
    8,026
    Location:
    The Netherlands
    I've read the paper, and I only understood the first part, the rest was too technical. So can anyone explain what's so cool about it, how is it any different from Sandboxie?
     
  25. MrBrian

    MrBrian Registered Member

    Joined:
    Feb 24, 2008
    Posts:
    6,032
    Location:
    USA
    From a brief skimming, I got the impression that it was about sandboxing via blacklisting/whitelisting system calls on Windows, so I thought Gullible Jones might be interested.