Links
Tags
apache
armenia
books
bsd
c
c++
chips
cinema
concurrency
cooking
database
dragonfly
erlang
filesystem
freebsd
fun
hardware
java
javascript
json
languages
linux
lyric
mac_osx
mail
math
misc
music
personal
poems
presentation
programming
python
references
ruby
rubyjs
scm
software
spiking_neural_net
study
sysadm
sysarch
technology
testing
travel
virtualization
web
wee
windows
Well, I'm currently working on our PeopleSearch engine. It's part of a student project at Fraunhofer Gesellschaft, one of the big research labs in Germany. Below, there is a screenshot:

For the prototype that we want to show tomorrow, I took the logo of my Guuglehupf search engine that I wrote 5 years ago in MLton. The logo is of course copyright by my friend W. Maier, who supported me in Guuglehupf. So it's just a temporary solution :).
The credo of this project is: Java sucks! Java sucks! Java sucks! And this is not even my own opinion, as I coded in Ruby from the beginning, but if you compare the results from the guys that used Java to those of Ruby, it's incredible. Spending one week with Java for almost nothing is natural. Uhh, and don't try to read the code :).
The first week was mostly spend for exploration. We wrote extractors for various pages, collected ideas. This second week was more structured. We made plans and had more concrete ideas. But we are still in exploration mode. There are so many free variables left. Extracting and caching is easy. Now the more advanced ideas have to come into our minds. To realize those ideas we need Ruby, or we will fail. From now I can tell that Mongrel, Ferret and Hpricot are great tools for such kind of task. I use them all. For testing I use RSpec, but I'm not yet sure it fullfills all my need.
The last two days I attended the FrosCon here near Bonn. I didn't got much new insights from the presentations I attended, but I didn't expected this. It was just nice to see some FreeBSD guys. They were quite talkative :). Very friendly people everywhere. Even musicians with their Jacklab Audio Distribution. This might be interesting for me or my dad to record music. The two Perl guys seemed not to get high traffic on their booth, so I went to their booth, but then I didn't want to fool them that Perl is dead and stuff like that :). In the end I happened to sit on the same table as a guy from open source press who heard my "comments/corrections" during the presentation about Ruby and Design Patterns. And then he knew Armin :)
Ah, I remember what was interesting as well. One of the FreeBSD guys (there were many of them :) said that here on this booth, there's only FreeBSD, Postgres and KDE :). And he said this with a deep self-evidence. About Postgres he said, "We are all the same - have the same root - you know". So I guess Postgres has it's origins at Berkeley :)
Uncle Bob (Robert C. Martin), founder and president of Object Mentor and author of many books about agile methods and stuff like that, writes the in his blog about the truth about XML.
"From now on anyone who considers themselves to be a serious professional must refuse to write another line of XML."
C-Jump is a Computer Programming Game Board for people of age 11+:
I'm just reading this article that compares Java (hedgehog) with OCaml (fox). Mostly, I like this article due to it's nice picture of a hedgehog.

This remember me to a story that I was told as a child. If I remember correctly, it's about a hedgehog and a bunny. They make a foot race, but the hedgehog is so intelligent that his woman-hedgehog waits at the destination and as the bunny cannot distinguish her from her husband, he thinks that the hedgehog was actually faster (which can't be as bunnies are faster). This proves that hedgehogs are actually very intelligent, and I really like them more than foxes. The conclusion therefor is, that Java must be a bunny and Ruby a hedgehog, and OCaml can be a fox if you like :). Did you know that foxes cannot kill hedgehogs?
The Mercury declarative language got a Erlang backend!
Okay, I am in the progress of porting Google Web Toolkit to RubyJS. The first 2500 lines of code (that includes a lot of comments) went pretty well. So far the basic is up and running. 99% of the DOM code (which I split up into Event and Element classes and renamed some methods accordingly) is functional. UIObject and Widget classes as well. But then it really starts to become Java-centric. Iterators, a lot of interfaces (e.g. class Label implments following interfaces: SourcesClickEvents, SourcesMouseEvents, SourcesMouseWheelEvents, HasHorizontalAlignment, HasText, HasWordWrap) and big switch statements make it now more difficult to continue with a mostly line-by-line translation. While I could just port everything line-by-line, I don't think this is the Ruby way. Right now I am looking at the Tree.java file, whose onBrowserEvent method (the Widgets central event dispatch method) contains a big, no a huge switch statement. As long as each case is followed by a break that's just fine. But it's not the case. What I really find ugly is this piece of code:
case Event.ONKEYDOWN:
/* lots of code removed */
// Intentional fallthrough.
case Event.ONKEYUP:
if (eventType == Event.ONKEYUP) {
/* ... */
}
// Intentional fallthrough.
case Event.ONKEYPRESS:
How can a human understand that. It's as bad as goto.
Instead of huge switch statements, I have invented a event dispatch mechanism, which comes at no memory overhead (or only very little, at least no new object exists) and only one additional method call. So you can write:
mapEvent(Event::ONCLICK, `#<self>.#<m:onClick>`)
mapEvent(Event::ONMOUSEUP, `#<self>.#<m:onMouseDown>`)
and the ONCLICK event will trigger the onClick() call. If this extra method call would be too much overhead, this could be removed by moving the code into the Event dispatcher code (which currently calls onBrowserEvent).
I am currently reading this article about Priority Inflation, which means that if only bugs of priority 1 or 2 get fixed, users that really want something to get done, assign their issue a priority of 1 or 2. To avoid that, there should exist some kind of bug scheduler, which automatically assigns bugs to developers in a way, to also bugs with lower priority get fixed, in a similar way as processes in an operating system get CPU time assigned. But I really think this is not needed :)
Online-Durchsuchungen sollen ja bald legal sein. Mir ist nur unklar, wie die Polizei das anstellen will. Entweder kooperiert die Polizei mit Microsoft, sodass bei einem Update der sog. Bundes-Trojaner aufgespielt wird, oder sie probieren es über E-Mail. Aber wenn der Benutzer nicht allzu blöd ist, und seine Software nicht allzu veraltet ist (in bezug auf Sicherheitslücken), dann sollte selbst die Polizei keine Möglichkeit haben, einen Trojaner einfach so auf ein System zu spielen. Oder es führt dazu, dass alle Kriminellen auf Linux/BSD umsteigen. Daran hat die Polizei bestimmt nicht gedacht, dass auch Computer-Systeme existieren, die kein Windows drauf haben. Ich glaub das alles ist technisch viel mehr heisse Luft. Vielleicht verängstigt es ja den einen oder anderen Klein-Kriminellen. Aber Terroristen können sie so bestimmt nicht fangen, denn die sind bekanntlich ja besser ausgerüstet als die Polizei.
OpenID is a single sign on system in use today, gaining popularity. Of course users should take care not to get phished, as you can read here. I don't think it's much of a problem...
I've been thinking about sign-on systems for some time. My approach is to reuse existing systems like email for authentification purposes. This is already widely used to confirm signups. You all know those confirmation mails containing a special link to click to confirm that you signed up. Given that you have an email application available, it's very easy to provide single-sign on. Of course phishing is also an issue here, but could be solved for example using PGP signed messages (requires support in the Email application).
When you want to design a sing-on system, there are many things to consider. For example, whether the user uses a machine that belongs to him or herself. If for example I use my own computer to log-in to a page, I could use client-certificates for SSL authentification. I think most browsers support this. But no user knows about this, and I think there are no web-applications that make use of it. But it would be the most secure and best solution, as you'd see the certificates of the server and have to trust them and it would automatically imply that all traffic is encrypted.
But what if you want to sign-in from an untrusted computer, e.g. from an internet cafe? At first, you will not have your certificate with you, second, the computer could be compromised (key-logger etc.). So you'd not want to type in a password anyway, except if that password will ever be used once, like those "passwords" you know from the TAN list for online-banking. One time passwords can be for example used to login to a FreeBSD system.
So if you consider logging in from an untrusted computer, a sign-in system should also give the ability to specify a one-time password.
In short, I'd suggest:
- For your home computer: transparent authentification using a client-side certificate + client-side master keyword
- Trusted computer with Email access: Email-based authentification
- Untrusted computers: One time passwords
While reading a lecture about Semantic Web hold at my university this summer term, I came across the owl:thing and owl:nothing classes (or whatever else they are :). This was the first time I noticed that "nothing" actually comes from "no thing".
Well, all this stuff is nothing new, so I read through the whole lecture in two hours. I am sure that after attending this lecture some students think that an ontology has something to do with XML. I am not sure if it's a good idea to spend three lectures just explaining XML. In my opinion all this RDF or OWL-stuff is just making actually implementing things like semantic web much harder than it actually should be. Think of Prolog (a logic programming language). How old is it? 30 years? After looking at Wikipedia, I know that 30 years was a quite good guess as it was developed in the 1970-th. Everything what you can express in RDF can be expressed in Prolog much easier. While RDF just describes relations between subjects and objects with a predicate, you can actually "evaluate" them in Prolog and build rules based on predicates. In RDF you need a special (SQL-like) query language to do the same. I have the strong feeling that if you'd like to implement RDF, you have to spend most of the time for just parsing the XML or the query langauge and not for actually doing anything useful. But that's the case for everything XML-ish :). I believe in languages, not in pure descriptions.
I thought about representing RDF in JSON:
{ "Netherlands": {hasCapital: "Amsterdam"},
"Amsterdam": {type: "City"},
"Amsterdam": {areacode: "020"} }
I don't want to give the equivalent in RDF, as there exist numerous variants in how to describe the same facts in RDF.
In the end XML is like Java, making things harder than it should be, keeping themselves busy. That's why Google uses JSON instead of XML and *BSD (excluding Mac OSX) made a statement to not use XML for their configuration files :)
Thanks to why's article I learned about the Processing langauge.
This popped up into my head just a few seconds ago while thinking about RubyJS. I'm a big fan of MLton, a whole program optimizing Standard ML compiler. I'm even listed there as user, as I wrote my Guuglehupf search engine using MLton.
So in RubyJS I also do whole program "optimization". Yes, really! Because I can take a look at the whole application that is going to be translated into Javascript. That's a huge benefit, as you can for example obfuscate a lot better than any other obfuscator that works on a file by file basis could ever do. As I wrote in my last blog entry you can also do unused method removal.
After 3 (in words "three") days of hard work and 1500 lines of code, the new completely rewritten version of RubyJS is nearing completion. I have learned a lot from the old version, and that's why I took the approach to rewrite everything from scratch. It has paid out. The code now is much cleaner and it's so much easier to extend (add optimizations). The generated code is much more dense and should execute much faster! This is thanks to direct method calls, instead of going an indirection over a rubyjs_send method. In the old version of RubyJS this indirection was needed to make method_missing working, but now I took a better approach, which lets you generate code like obj.method() instead of rubyjs_send(obj, method).
If you'd take a look at the implementation of rubyjs_send you'd see a lot of if statements, that check the type of the receiver and then dispatch to the method. While this makes it possible to use the special values "null" and "undefined" as receicer (they are special in Javascript as they don't have any attributes or prototype associated with them) it is just slow. I assume the new version to be around one order of magnitude faster in the end, as there is absolutely no overhead in calling a method, thanks to the introduction of "nil", which is a special object. This of course introduces new problems like initializing instance variables to "nil" before using them, but I figured this out.
There are still some things missing in the new version, but they are pretty straight forward to implement, especially as I've already implemented them in the old version. So it's mostly copy and paste and then modify a few bits.
The new version generates not only very condensed code, it also generates unreadable Javascript code. This protects your future client-side Ruby knowledge from being decompiled. You will be able to tell RubyJS to not generate tables that contain mappings between the Ruby method name (unobfuscated) and the method name used in the generated code. Those tables are required if you want to use send or methods etc. I think they are really not used that often, so it's worthwhile to have such an option, and it makes it nearly impossible to decompile it back to understandable code :). There are further options for generating faster code, omitting arity checks and so on.
What has changed as well was the method calling convention. In the old version I used:
function(self, args, block)
whereas I now use:
function(block, arg1, arg2)
All arguments are part of the function signature, except splat (catch-all) arguments, that collect all further arguments. In the new version they are constructed via Javascripts "arguments". That's by the way an ugly thing. I thought it would be a real Javascript Array but it isn't. So you have to loop and copy every element to a real Array. That's ugly, but it's the price you have to pay. The advantage is, that in the new version you no longer create an array per method call, which was a big overhead!
I really think given another day of hard work, RubyJS is (mostly) feature complete and I can start working on doing something "real" with it. Last night (this morning :) I took a look at Google Web Toolkit, especially at the API and class hierarchy. While it's Java, they did it right! It's not unusual to see over-designed Java APIs, but this for sure is not one of them. Of course in Ruby it will be much easier as you don't need all those interfaces. Porting the widgets code and DOM to Ruby I assume will take me another 1-2 days. I will not port it line-by-line, I'll write my own code, just copying most of the API, which I don't want to reinvent.
I'm also very interested in a comparison towards Prototype. This gives an infrastructure to write Javascript code in a more Ruby-like fashion. This is all part of the core libraries in RubyJS and is written in Ruby (with inline Javascript where needed). I'm interested in see how the generated code size compares to the size of Prototype AND how the execution speed compares. I think size-wise there will be no big difference, maybe RubyJS will generate even less code, due to variable name obfuscation. Performance-wise I assume RubyJS to be not more than twice as slow as Prototype. With optimizations turned on, I think there will be no big difference at all. Of course I'm hoping to see it run faster, but it's hard using a dynamically typed language like Ruby to generate more efficient code than hand-written code :).
In a thread about this new Javascript Rails port I read about the fact that Javascript doesn't have method_missing, which is abused to ad-absurdum in Rails. So this made it harder to proper port it to Javascript. In RubyJS you have method_missing thanks to the compilation step! You can really do a lot when compiling. I will also do method removal of unsed methods. Again thanks to compilation and static method call recording.
You might not believe it, but it's really true! The code given below is Ruby code ;-)
function(_g,_b,_c,_d){var _a,_e,_f,_h;_a=_h=nil;
if(this.$c===undefined)this.$c=nil;if(arguments.
length<2)throw('ArgumentError');if(_c===undefined)
_c=_b;if(_d===undefined)_d=2;_e=[];for(_f=4;_f<
arguments.length;_f++)_e.push(arguments[_f]);;
this.__invoke(_g,'$a',rubyjs_splat(_e));this.$b(
nil,1,2,3);this.$b(nil,2);_b.$a(nil);_b=_c=_d=1;
_h=1;_h=this.$c;this.$c=1;if((_f=this.$c.$d(nil,4),
_f!==false&&_f!==nil)){_b=_c}else{if((_f=true,_f!==
false&&_f!==nil)){}};_b=((_f=true,_f!==false&&_f!==
nil)?1:5);while((_f=true,_f!==false&&_f!==nil)){
_b=_b.$e(nil,1);_b=_b.$e(nil,2)};_a=3;return _a}
Well, not 100% true, but it is generated from Ruby code using RubyJS. The Ruby method is actually very complicated, using splat arguments, block arguments etc. I assume that the equivalent method in hand-written Javascript would be around the same size. I can easily omit arity checks and stuff like that to reduce size even more.
Now I solved the hardest part of my Ruby to Javascript Compiler. The implementation of Ruby's metaclass model. Last time, it broke my head, so I took another approach. This time I wanted to do it right, and after some hard thinking and a little grain of magic I did it. Object is a Class. Class is a subclass of Object. Class itself is a Class. Who doesn't get confused?!
In Ruby really everything is an expression (as well as an object), while in Javascript there is a distinction between statements and expressions. For example "if" or "while" are statements. In Ruby you can write for example:
a = if a > 0 then 1 else 0 end
Which is equivalent in Javascript with:
a = a > 0 ? 1 : 0
The above is of course valid Ruby code, too. So if you want to write a Ruby to Javascript translator, use "?:" instead of "if". But how to map "while" to an expression in Javascript? Well, the easiest approach is using a function:
while(cond) { stmts; }
// maps to
(function(){
while(cond) { stmts; }
})()
That makes it very easy to translate between Ruby and Javascript. The whole method body is one expression, so the Ruby code:
def meth()
stmt1
stmt2
stmt3
end
maps to the following Javascript:
function() {
return (stmt1, stmt2, stmt3);
}
where each statement (stmt1, stmt2, stmt3) is translated into an expression.
Of course functions have some overhead.
What I've not thought about is "return" and "throw", which are statements as well, so that the approach given above does not work for them.
DragonFly is getting support for running a virtual kernel multi-threaded, i.e. a SMP virtual kernel. A virtual kernel is a userland process (like running any other application) that runs a whole kernel. It's especially suited to make kernel programming life much easier, because you can now test a new kernel within seconds while you don't have to reboot your machine where you are working on.
I think this is required to ease Matt's work on the new HammerFS clustered file system.
A summary by Matthew Dillon about the bugs in Intels Core processors.
"YARD is a C++ framework for constructing recursive-descent parsers from parsing expression grammars (PEG) at compile-time."
Thinking Forth and Starting Forth. Especially the latter is very nice, with lots of pictures in it.
This article explains neural networks and gives an implementation in Erlang. While the approach using one process per perceptron must be dead-slow, it's simple to implement. I also thought about implementing pulsed neural networks using Erlang, but it's a hard problem to simulate a pulsed neural network in a parallel/distributed fashion, and for most neuron types (and small networks) it's just too slow.
Take a look at this picture. I fear chips are becoming too complicated. Also read the article Intel Core 2 considered evil. The author Theo de Raadt -- founder of the OpenBSD project -- does not recommend to buy any Intel Core 2 processors until these issues are fixed. Now I'd like to see whether AMD has less errors in their CPUs.
OpenKeta is a webserver running inside the kernel. As such it saves some time required for switching context between kernel and user-mode. It's available for Linux and FreeBSD. I tried to compile the kernel module for FreeBSD. It compiles easily, but it doesn't run. I tried to find the reason, hacked around in the sources, but in the end give up. Luckily it didn't let my laptop crash :)
RubyJS is a Ruby to Javascript compiler I wrote a few month ago. One of it's main features is that you can use all meta-programming tricks you know from Ruby, as before it compiles the Ruby code to Javascript it executes the top-level code (require, attr_accessor and stuff like that). So you have the full power of Ruby.
Now I took a second thought on how to compile the code in a most efficient way. The old version used rubyjs_send() calls everywhere. It's the equivalent to Ruby's send. That's of course inefficient, at least a bit :). And of course the code grows. The only problem we face is to reliably implement method_missing as this does not exist in Javascript. One method is to check an attribute for null-ness before calling it. But that introduces a lot of code. The other approach, which I am going to use, is very simple. Every attribute (i.e. method/function) we call is defined in Javascript. The compiler knows all method calls, so it knows every method name that will ever be called. Now, we simply generate for each method name an attribute in the Object.prototype, from which all objects inherit. Every attribute is assigned the method_missing function. Subclasses can overwrite this of course. Simple eh? And efficient, hopefully. Don't know how prototypes are implemented in Javascript, hopefully thru a reference and not by copying every attribute.
It's just that I don't have so much time implementing all those changes and writing something like GWT. Or are there any sponsors? :)
You need a JSON parser in C++? No problem! I wrote one for my spiking neuron simulator. It's available here. Well, don't use it yet when strings contain escaped characters, as I don't do any escaping yet. I use Ragel for the state machine. Well, I have to admit that Ragel is very powerful. But on the other side, if you don't read the manual carefully, a lot things can go wrong. In that case it's easier to hand-craft your own parser.
I've not yet decided where to go during holidays. Maybe here?
It's a story about my new pulsed neuronal network simulator. The old simulator (written by some students at University of Karlsruhe) was just too slow. So it was given to me to make it fast. Initially I should focus on the algorithm of the global event priority queue used for scheduling the events. The old code was a big mess, but nevertheless I started to clean it up a bit, removing some fancy ueber-design-issues like strategy patterns (it's not the guilt of the students, we learn design patterns at university :-), or removing lots of code in favor of more understandable code. I got somehow a speedup of around 12%, if I recall correctly. That's not much. And there were other issues like memory leaks. Not to mention the "5000 lines of C++ code" XML loader. I wrote a converter script in Ruby in under 100 lines of code ;-). So in the end I was faced with a decision. Either continue to mess around with the old code, or to throw it all away and start my own simulator. Everyone who knows me, knows how I decided!
It took me a few days to have a running simulator, with under 1000 lines of code. It still had a central priority queue. Nevertheless this rewrite was up to 4 times as fast as the old one. No magic datastructure was used. I compared my PairingHeap priority queue (using a free-list) with that of a STL std::priority_queue, there was no big difference. But my simulator was just faster. I don't know exactly why. All I did was carefully passing data structures around without copying. And with a clear assignment who is responsible for freeing that data structure (I used a free list).
Up to that point, I did not do anything different. This changed as I realized my idea of each neuron having it's own event queue. A decentralized approach. There still is a central priority queue, but this has only as much entries as neurons exist, whereas before it contained as many entries as events exist. A big difference! The number of neurons is constant, whereas that of the events is not. I also extended the implicit binary heap data structure (which has getMin and push operations in O(log n)) to be able to modify an elements priority. With a standard implicit binary heap, you would have to find the element in question, which takes O(n) and that clearly is unacceptable! So each entry simply keeps track of the position inside the heap. Of course that's a bit bad for cache effects, but I think it's still much better than a tree (which has worse cache behaviour due to pointers).
This decentral approach has a lot of advantages. Performance is one of them. It's about twice as fast and can be much faster if merging events in the local event queue that are close together. Thanks to the locality, each neuron can specify how to store events locally. So there is much more room for optimizations. Another advantage is that it is also capable of process oriented simulation, as each neuron (and each synapse if desired) can tell the simulator when it should be activated the next time.
What else did I do for the simulator? Well, I implemented a JSON parser in C++ using Ragel. And yeah, I learned if you don't have a garbage collector, then better use reference counting.
Initially, I implemented the spiking neural net simulator using C++. The result turned out to be very performant. The code was pretty clear as well, except the separation into header and implementation files, which can get pretty ugly in C++, e.g. when you want to use inline functions etc. So far, everything went pretty well. But once I started to implement a loader for the neuronal networks and event lists, C++ turned out to be very ugly. Actually, not C++ itself, but not having a Garbage Collector. It turns out to be a mess: "Who is responsible for freeing that string". Now you spend most time on issues like this and keep you away from doing the important stuff.
Of course I was aware of that before. That's why I reimplemented the whole simulator in D. D is a nice language combining the performance of C++ with a nice module system (away with those ugly header files) and a garbage collector. But I had to realize, that it's not yet production ready. Or maybe, I'm doing something wrong. At least it's 4 times slower, and once I use a larger network, it starts trashing.
So what are the choices? Java? I think, Java would be much slower. Ocaml? Maybe.
My study thesis is about improving performance of our event-oriented pulsed neuronal network simulator Inspire. As most of the time of the whole simulation is spent inside the event priority queue (getting out the next element or inserting future events), it’s clearly the point to optimize. Now, the most recent empirical analysis of priority queues is from around the year 2000. It’s clear that during the last 7 years the performance of processors has increased by a factor of around 25 (Moore’s law) whereas the performance of memory has not increased that much. In short, cache behaviour got more important!
I first got really disappointed after I benchmarked my Pairing Heap priority queue implementation against the std::priority_queue of C++ STL. std::priority_queue uses a simple inplace heap algorithm. It is very fast if you only use Integers for priorities and nothing else. But that doesn’t help you much. It gets slower the more memory the item structure uses which contains the priority, just because it has to copy sizeof this structure around log(n) times for each insert and delete. Then I benchmarked the same with priority_queue<Event*> (instead of priority_queue<Event> or priority_queue<int>), and that showed up to be slower than my Pairing Heap implementation. A very strange thing also appeared as I tried a structure 16 bytes in size via priority_queue<Event>. A bad_malloc exception was raised for some reason. Maybe it’s due to some strange memory allocation techniques used in the element container “vector<Event>”.
Next, I’ll benchmark my calendar queue implementation. But I fear it will be slow until I implement some kind of dynamic (intelligent?) bucket-width or bucket-size heuristics. During the implementation I got the feeling that a calendar queue, while being theoretically pretty simple, is an ugly thing to implement. Just for example, it is very hard (at least for me) to implement negative priorities correctly. There is more ugliness like O(bucket-size) behaviour when no event is found within a “year”. I hope that most of those problems go away with the DSplay priority queue, which is a combination of a Splay tree (or any other kind of fast priority queue which is optimized for a smaller number of element), a kind-of calendar queue (it’s something very different indeed) and an overflow list. The good thing is that parts of it could be parallelized, e.g. sorting the unsorted buckets. I have a lot of plans for improving the DSplay. E.g. instead of a single Splay tree, I’d suggest to use a Splay tree for newly inserted events, and a sorted list for those events that come from the bucket. So, as already mentioned, the sorting could be done in parallel (if that makes sense for a small number of elements).
In the past I learned that it’s nearly impossible or at least very hard to build reusable components, especially in the web application domain when building custom sites (and not a “Web-GUI”). Nevertheless, I always try to write reusable code, even if it will never be reused. The positive side-effect of this is, that my code is usually much easier to read, is much less coupled and easier to change. Of course it’s also less code. Whenever you think “make it reusable” you reduce the coupling and as such increase code-quality. Of course you should stop at some point and should not spent too much time to improve the code.
I think the same applies when you code in an object-oriented manner. You think about how to organize your code into classes in a way that makes sense, reduces coupling and reduces code size. Again, stop at some point and don’t use too many classes. Too many classes make your code again unreadable and hard to understand.
If I’d design a new language including the runtime-system, that language should be/have:
- A very small core.
- Mixed run/compile-time typing. Many types can be infered at compile-time and fast running code can be generated. If not, fall back to runtime typing.
- The compiler is part of the system. Why do you want to generate binary code? Just ship the source code (of course you can obfuscate it; well, Java byte code can be disassembled back to Java source code very well!) and the compiler will compile it on-the-fly to machine code.
- Structural subtyping. No tag-types as in Java. Tag-typing (typing based on simple names like “Array” or inheritance-relationship) can be simulated by special structure values, e.g. {tag_type: ‘Array’}.
- Compacting non-conservative GC. A GC that never leads to memory leaks!
- Region inference. What is the life cycle of a memory object? Is it local to a method/function? Allocate memory based on this information.
- Keyword arguments.
- Kind of selector-namespaces
- No global variables (can be easily simulated by a singleton class).
- Maybe Multiple heaps (thread local heaps), so that GC can run in parallel.
I’ll discuss Open Classes, Extension Methods and Implict Defs in Ruby, C# 3.0 and Scala.
You know Ruby has open classes, i.e. you can add methods at any time to them or remove them. In Java this is not possible. Imagine you are using a third-party library in your code and then later the interface of that third-party library changes. Imagine further that this classes of that third-party library are used all over your classes. Of course usually this will not happen! But just imagine. Adding the old behaviour to a subclass does not work, as you’d need to change the types of all method definitions. So the only chance you have is to rewrite all your code that uses those classes. Next time better wrap those third-party libraries into your own classes which you can controll! But now it’s too late.
In Ruby, you’d simply add a new method to the third-party library class. That’s easy. But of course this is also dangerous, as you can modify all classes at any time, even the standard classes like Array or Integer :). C# 3.0 and Scala have solutions for this problem called Extension Methods and Implicit Defs which I’ll show an example of each below.
See this C# 3.0 example:
static class HelloWorld
{
static void Print(this string s)
{
Console.WriteLine(s);
}
static void Main(string[] args)
{
string s = "Hello, world";
s.Print();
}
}
You have to declare a “static class” and within a “static void Print(this string)” method, which is an Extension Method for class “string”, but only locally in the scope of the static class.
In Scala you can achive the same using Implicit Defs or so called Views:
final class StringWrapper(str: String) {
def Print {
Console.println(str);
}
}
object HelloWorld {
implicit def stringWrapper(str: string) = new StringWrapper(str)
def main(args: Array[String]) {
String s = "Hello, world";
s.Print();
}
}
I personally find the way Scala handles this much easier to understand than the same in C#. Also the use of “object” instead of “static” for class methods is much more intuitive.
There are many languages that replace C or C++ for most of todays tasks, e.g. Java or C#. But when you need absolute performance or want to code near the system level, those are again no choice. This is where D shines. It has basically all features of Java or C# but compiles directly to machine code. Some advantages against C/C++:
- no header files anymore (modular compilation)
- no preprocessor, no ugly macros
- garbage collector (+ use your own memory allocator)
- build-in strings and arrays
- advanced template support
- lazy evaluation
- Small things like multi-line strings, which can be very annoying and tedious in C/C++.
It’s really a great language. Lots of mistakes in C and C++ have been removed from the language. It can be easily mixed with C code and it’s damn fast.
I’m currenty busy sitting in the university library, chatting with Rameen while reading a very good Linux Magazin edition (02/07). In this edition, I read about Comet, which is something similar to AJAX, with the difference that the client application has a connection to the server open the whole time, so that the server is able to send updates to the client. So the client does not need to poll the server regularily. This has many advantages. Of course the server needs to be able to handle a lot of connections, which is impossible with one connection per thread.
Then I read about the AJAX browser-backbutton issue. I implemented this myself a few weeks ago. But it was nice to read how it works with Internet Explorer using an iframe, because this was missing in my implementation.
Now, I got to a very interesting article about Mercury, a declarative Prolog-like language. I read about it years ago, but now it sounds very interesting to me again. As I read the article, I came across modes and especially the two modes “di” (destructive input) and “uo” (unique output) remind me a lot to Concurrent Clean’s uniqueness typing, which avoids Monads as used for example in Haskell to provide “side-effects”.
I’ve ever been interested in programming languages, so I regularily look out for new languages, evaluate them, criticize them, and maybe use them. What I find curious is that most of the time, new languages concentrate on theories and on new techniques. They sound very good in theory. But what is often forgotten is the human. Readability, simplicitiy and understandability are from my perspective key-points of a language. They avoid mistakes and enable larger systems. And in my opinion this is where Ruby shines. I’ve not yet encountered a language that has a nicer, cleaner syntax than Ruby. Erlang has very nice properties like scalability, concurrency, pattern-matching and more, but it leads to code that is more difficult to read. I also read about Scala, a functional object-oriented language. The feature-list looks nice, but then I took a look at the implementation of futures in Scala here and I got disappointed quickly due to a “not so good” readability. Look at all those type declarations. I agree, that this is an advanced example and that “normal” Scala code looks similar to Ruby code. It’s a really nice language. But maybe it tries to do too much. It’s very powerful, but maybe that power, and all the features it provides, makes it harder to understand the language! Ruby instead, tries to be a simple, pure-OO language. Nothing more than OO. Scala tries to be functional and OO at the same time. It mixes Java syntax, ML syntax, Erlang syntax (! as send :) and features of all of them together. But you can’t use it in situations where Erlang shines (well Scala has similar message-passing as Erlang). On the other side, Erlang is not so good at modeling complex abstractions, as you can find in business systems. In this situation OO is just superior. Hm, I think Scala is worth looking at it. Definitively! Erlang as well! After you read this Haskell vs. Erlang thread you won’t consider Haskell for concurrent, distributed applications. Hm, Scala reminds me a bit to the Moby language (syntax and feature wise), but too sad that Moby was never really released. The more I read about Scala, the more I like it.
Dragonfly is a spin-off from the FreeBSD 4.x line of development and is mainly oriented towards cluster-computing, which is getting more important every day. But that’s not it’s primary goal, it runs as well on a simple desktop machine (I tried that in the past, and it runs as good as FreeBSD). What’s new in 1.8?
- Virtual Kernel: Runs a kernel in user-space, similar to User Mode Linux. This work is very important for future work, mainly for testing the cache coherency layer and clustering, which will be implemented as part of the kernel.
- nullfs: arbitrary stacking
RubyJS is now working pretty well! The tests are now compared with the results of running the same test with the “real” Ruby interpreter.
Now I’m working on object dispatch. What I mainly want is to treat Javascript DOM nodes as RubyJS objects. So you can do:
def test
e = `document.getElementById('id')`
p e.class # => DOM::Element
end
I avoid to wrap the objects into my own objects and use singleton classes. This makes it much easier to pass those objects around between Javascript code and RubyJS code (no conversion needed). And it should save some memory. Only problem is that Konqueror doesn’t implement ‘instanceof’ for the DOM objects. But it’s possible to do the same in Konqueror via signature testing.
I lately came aross Google Web Toolkit. It’s amazing. Except, well, Java. If you write the server side in Java, perfect! If not, then you have a mixed feeling. But really, it’s a great toolkit, except that I’d like to be more free in how and what is generated, and I’ve not yet understood why they are generating all those files. Then I dislike Java’s package naming which leads to very deeply nested directories (which is okay if you use an IDE).
Then I saw Pyjamas aka py-gwt. Wow! Impressive! Much better to use Python instead of Java. And more important, it’s easier to modify parts of the generator!
I also read about converting Ruby to Javascript as described here.
Before I started my own :). You can get the code from here: http://ntecs.de/hg-projects/. Or use Mercurial to get your own repository:
hg clone static-http://ntecs.de/hg-projects/rubyjs/
So what is RubyJS?
In simple words: It converts Ruby to JS. You have to write some kind of “static” code, i.e. the converter introspects the classes as they are in-memory and then converts this version to Javascript. You cannot neither add/remove methods at runtime nor add classes at runtime. Method calls inside the same class (self as receiver) can be compile-time checked (not yet, but simple to add in one line). Inheritance and Mixins are supported. Compile-time constant lookup. Class methods as well (and inheritance of class methods). My brain still hurts from thinking about Ruby’s Metaclass model. Even the core classes are written in Ruby (with inline JavaScript). Method calls on immediate objects like true, false, nil and numbers are supported. Other objects can be boxed (numbers can be as well if you like). So it converts “1 + 2” into a method call, instead of assuming that JavaScript’s ”+” operator will work right!
Of course a lot is still missing, mainly compiling constrol structure, blocks, instance variables and of course the whole core library.
But it is really cool, because you can use all sort of Ruby’s meta-programming tricks, like “attr_accessor” for free. Or “require”. It executes as Ruby before it is being translated.
Update 1
Instance variables and super calls are now implemented.
Update 2
Compile time selection of method for calls within the same object (includes “super” calls).
Update 3
implemented iterators, while, break
In my last article I only tested sequential performance. Later I wrote a multi-threaded version and noticed that performance dropped drown a lot (from 3000 to 2000 page requests/second). Well, multiple threads are essential as the performed action usually takes longer and might block.
I made some basic non-professional performance benchmarks of a simple “Hello World” application between Mongrel (proxied through Lighttpd) against a SCGI-adapter via Lighttpd. This shows that the SCGI adapter is more than twice as fast. I also benchmarked AJP13 – the Apache JServ Protocol, a proprietary binary protocol. It is implemented in Lighttpd 1.5. There must be something very buddy, because it is around 100 times slower :). And it’s definitively not my implementation. There are two major differences between SCGI and AJP13. First, AJP13 reuses connections (but does no multiplexing like FastCGI). This should be a little faster. Second, it encodes common headers like “Content-length” as a single integer. But it has some very ugly inconsistencies for my taste. The SCGI implementation in Lighttpd 1.5 seems to be less production-ready as the same for Lighttpd 1.4. For example, when a SCGI connection is closed, lighttpd tells you about it, even if it is part of the SCGI protocol that the connection must be closed at the end of the transfer. One advantage of a structured response back to the server (like AJP13 does) is that the response parsing step in lighttpd could be easily ommited. And of course less data is sent across the wire. Less memory usage. Less connection-establishment overhead. So, once lighttpd 1.5 gets production-quality, I might write my own high-performance Ruby adapter.
At the time when Rails RJS Templates were announced, I though: “Wow, that’s cool stuff”. Since then, I used it in several projects. Well, “used” means here, that in most cases I simply made an element showing an effect etc. I never really thought about why I am using RJS. Because it easily plays together with Rails link_to_remote helper methods. Maybe that’s the reason why RJS is being used. The link_to_remote already takes so many arguments, and in my experience all that stuff really makes a view looking pretty ugly. You see a lot of Ruby code inside your HTML, which makes it very hard to see the structure of the HTML (and the sense of the Ruby). But that’s another topic. Back to why “I consider RJS harmful”.
Actually, it’s so easy to do the same on the Javascript side. Just add an AJAX-onSuccess event and put there your code. So I really don’t see any advantage of being able to write in Ruby:
page.select('#content p').each do |element|
element.hide
end
instead of the same in Javascript:
// jQuery
$('#content p').hide();
The Javascript is even much more terse. In general I think that it’s a bad idea to generate Javascript code on the Ruby-side. Better design your Javascript application well! And don’t mix up Ruby with Javascript.
In the web-application that I’m currently writing, I use JSON. This results in a cleaner separation between server and client side. You have to define clean interfaces and your Javascript (as well as your Ruby application) looks less like a hack.
Another advantage is that JSON should lead to less bytes being transfered over the wire, and it should be much easier (and faster) to generate. It’s also more secure, because you could get rid of Javascript’s eval by using a JSON parser. Keep in mind, JSON is pure data! You could even get rid of your XMLRPC or SOAP API’s completely and only use JSON-RPC. IMHO it’s much easier to implement (hey, I wrote my own JSON generator in just 1 minute using a few lines of code ;-) and much faster than any XML approach and you don’t have that ugly base64 encoding for binary strings in XMLRPC (or XML in general) where you explicitly have to know whether your string contains binary data or not.
Another thing: Someone had to implement that Javascript generating stuff (called RJS in Rails). I’ve not yet seen a use-case where this amount of work would be justified. Less code is always better!
With this patch you can add you C functions as methods which get passed an arbitrary argument upon each call. See what it can do in case of struct.c.
The Friday before my Iran trip, once again I had nothing to do, so I decided to make a competition with “our little ape” Sergey. Who is faster in implementing The Blocks Problem. That’s unfair I know, not only because I should be a more experienced programmer, but also because of Ruby vs. C#. Most problems occured due to the problem description being hard to read and understand. But then, after 1-2 hours I had a correct solution. It can be found here.
I’m lucky that I’ve had the chance to work on a .NET web applications. Only because I now know that it really sucks (even more than Java)! The problem is, that it’s too complex, so that ordinary web programmers don’t really understand it. And the code really looks like that. So many code that really does nothing “useful”. I am really sure that you can write the same in Ruby in 1/10 of the code. And it would even run much faster. And cheaper. Is much easier to maintain, because you can automate so many things that you hardly can in Visual Studio. Oh my god, who is using version control system with Visual Studio? Maybe I’m the only one? There’s no one by default, what a pitty. And the documentation is really bad! It took me 1 hour to find out how to simply add an Application_Start handler. My tip: Just don’t use it :)
As I’ve nothing to do here in Yerevan, I started working on yet another O-R mapper for Ruby. The goal is to keep it small and understandable. Maybe below 200 lines in total (just including PostgreSQL support). And, all this belongs_to, has_many or has_one stuff is no longer needed. Those relations are detected automatically! If you have a database defined according to the naming schema, you can just start off, without writing any line of Ruby code.
Furthermore, only those attributes you set will be written to the database. In this case, either NULL or a DEFAULT value specified in SQL is used. And I’d like to implement the detection of modified attributes, so that only them are updated, not all!
SQLAlchemy is a pretty nice Object Relational Mapper (ORM) for Python. But IMHO, they reinvent SQL for table definition as well as querying. In the end, I find it too complex and non-intuitive to use. I don’t say it’s bad, but I wouldn’t use it directly.
What I’d really like to see for BSD (FreeBSD or DragonFly) is something like upstart for Linux or launchd for OSX. I like upstart a lot more and dislike launchd’s XML config files. Such a daemon should make crond and inetd superfluous. And it monitors processes, so in case a process abnormally exits it would get restartet. This makes the whole system much more stable (imagine accidentally shutting down ssh).
I am a big fan of disconnected and decentralized behaviour. For exmple for Sourcecode Version Control as found in Monotone or Darcs. Centralized systems are usually much easier to build because you don’t have to deal with synchronization and conflict resolution issues. In decentralized systems you can easily run into conflicting situations that cannot be resolved automatically. That is the biggest downside of decentralized systems. For me, the biggest advantage of a disconnected system is that, well, as the name suggests, you don’t need a network connection. I am now in Armenia and I don’t have all the time internet connection. So I can’t use any web-application anymore. Imagine a web-based photo-album service or just a blog. You have to wait until you get the connection again. That is kind of silly, IMHO. Of course I know that most of the “normal” users don’t have this requirement. And the situation with network connections are getting better every year. So in a few years we might have ubiquitous access to the internet. But wouldn’t ubiquitous internet access be the death of all desktop-applications? You can’t use them in an internet-cafe for example. Maybe, but probably not. Better you design your application (iff it makes sense) as a web-application or provide an additional (maybe less feature-rich) web-interface for it. At the same time you can make it run locally in a browser using fancy Javascript stuff. So you can get rid of the extra GUI interface. This way, you can use the application locally as well as via the internet (but not both at once). Running it locally is for sure a very silly idea if you want to show your frieds for example the photos you made. So this is the reason to introduce decentralized, disconnected behaviour where you have a local database running, a local web-server (or just any other kind of GUI if you like). But you are also able to synchronize with the same application running somewhere “in” the internet.
Approach
What would it need to make for example a simple blog-engine decentralized? Well, at first you need of course a local database. Take sqlite! On the server side you can still use any other database if you like. The main thing is that you record deletions. Because otherwise, if you synchronize they probably reappear in your local database. So it’s a good idea to record each change. Inserts, updates and deletions. Together with a timestamp, so that upon synchronization you can figure out which precedes which. If you use simple integers as primary keys, then it’s little bit harder to synchronize. Because they differ with a very high probability from local to global database. So you have to translate local ids to global ids, for example if you update a record, you have to figure out which record this is on the global database. Whereas if you’d use a primary key that represents the content of the record (e.g. SHA1 sum), it’s very easy to compare.
One thing that most annoys me in using Rails are the generators. While wizards are quite different than generators (wizards generate code you don’t understand), they have one thing in common: They generate code, code that you have to maintain, code that contains lots of redundancy, duplicate code. If you know The Pragmatic Programmer read The Evils of Duplication. While code generation enables rapid prototyping and development, once this code has been generated, it has to be changed by hand. For very customized applications this is no problem, but for applications that are kind of “redundant”, you’d better use a different approach. I generally prefer an inheritance based template model, where you can overwrite parts which differ from a base “class”. But I also know, that it’s hard if not impossible to build reusable components, at least for the web. I tried it with Wee and would call it a failure (not Wee, but reusable components).
Les das gerade in der Pforzheimer Zeitung: Wissenschaftler von der Uni Karlsruhe haben angeblich den weltweit ersten atomaren Transistor entwickelt, der mit Hilfe nur eines einzigen Atoms den Strom schalten soll. Congratulations! Jo, unsere Physiker sind schon verdammt gut.
In the last couple of days, I wrote a lot about databases and their implementation. Mostly about replicated databases and their techniques (MVTO, quorum).
Replication is only one technique to scale-up a database, increase read-performance and availability, but mostly not so much write-performance. Whether write-performance is increased depends heavily on the granularity of database objects considered for conflicts in transactions. If you use tables, then this quickly becomes a bottleneck. So better choose rows. This increases concurrency, but also the management overhead. One problem with rows as “objects”, is their primary key. If you use a sequence, this then becomes the bottleneck (only for inserts). A solution would be to use a sequence generator per database node, each generating unique sequence numbers (just prepend the node-number for example). This way you could achieve large concurrency for both inserts and updates. Updating different rows could happen on different nodes without conflicts.
The problem with replication is that it’s not easy to add on existing databases. And trying to build your own (like me :) is doomed to failure. So what’s the alternative?
Partitioning
Partitioning the data into independent pieces is always A good idea™, if not the best. Each node owns a part of the data, so no or little synchronization or conflict resolution has to happen. And it’s quite easy to implement on top of existing databases (just write an interceptor which understands SQL and distribute the queries to the corresponding node). But if your data has lots of (inter-table) dependencies and inter-node data dependencies, then you might be badly off, because then you cannot execute the query on a single node and you have to go and query other nodes. Another problem of partitioning is that it’s availability drops down. One node is less error-prone than five. The chance that one machine out of five is defunct is much higher than that of a single node. Compare it with striping in a RAID system! So ideally you employ both replication and partitioning. Have n separate clusters, each serving a part of the whole data, and within the cluster replicate the nodes.
Analogy with processors
You might ask, what has partitioning (or replication) to do with a processor? Well, in a replicated system, you have to maintain coherency. Most shared memory multi processor systems employ cache coherency (from simple dual-core processors to bigger SMP, DSM or ccNUMA systems). And what do caches do? Caches replicate data! So they have to communicate to keep them synchronized. They do so by using a cache coherency protocol (e.g. MESI – Modified, Exclusive, Shared, Invalid). In a presentation about multi-chip/multi-core processors at my university, I heard (if I recall correctly), that a huge amount of a chips area (I don’t know how much exactly, but I think it was from 10 to 20 %) is spent on implementing cache-coherency. And I think, this was a dual-core system. But I might be totally wrong. While caches increase the performance of the local processor, in a multi-processor system it is a bottle-neck, and it really does not scale. Those systems are easy to program with OpenMP for example, but they just don’t scale. I think the largest cache coherent shared memory system has 256 processors (it’s actually a ccNUMA DSM system). And because it’s a NUMA (non-uniform memory access), meaning that local memory is faster than remote memory (but both are in the same address space), you as a programmer still have to know about that fact. So it’s not a big improvement over message-passing at all.
The cell architecture is a quite different approach. It’s internally message-passing, even if it’s on the same processor. It does not use caches for it’s processing elements. Data has to be transfered into the processing elements local memory. No coherency management needed. Lots of chip area saved, so that the local memory is the fastest memory available. It’s as fast as a register access! And you can even connect multiple chips together. This approach clearly scales. Due to partitioning!
That’s the approach I’m mostly interested in, and which was used in the distributed Backplane database. For fast lookup it’s important that you order the tuples in reverse timestamp order. But you also need to index on the tuple-id. The first tuple you find will be the youngest tuple. If you used a timestamp in the past, you have to search through until you find the tuple with a smaller timestamp than the transaction. It’s important that you first order on the primary key, then on the transaction timestamp. Or at least that’s my understanding.
A Cache-Sensitive B+Tree (CSB+) uses one cache-line per node. A cache line is usually only 64 bytes, so only around 14 32-bit keys fit in there. In comparison to a regular B+Tree, which has node sizes of 4k and above. The trick is to avoid the child-array. Only one pointer to the first child is stored. This points to the node group. The corresponding node is found by simply using the index of the key-array and add it to the pointer. A node group is simply the size of a single node (64 bytes) multiplied by the number of keys per node (e.g. 14).
Another unrelated technique to improve key lookup in a B+Tree node is to use interpolation search instead of binary search. Binary search has a worst case runtime of O(log n). Interpolation search has O(n), but usually finds the key within 3 or 4 lookups. You store the smallest and largest key, calculate the difference and divide by the number of elements. So you know the “step” per key. Then you add the smallest key to the lookup key and divide by “step” and use the result as index. You can also use binary search as a fall-back method, if you’ve not found the key after 4 tries, so you keep the logarithmic worst case.
Another technique that can be used to increase insertion performance is to use gaps within the key-array. By doing this you avoid copying the whole array before you can insert a new key.
While Skip-Lists are quite popular and well known, nobody knows about Skip-Trees. It’s an interesting paper. The only ugly thing is that nodes may have an arbitrary number of keys. And there’s not much known about it’s performance characteristics.
Hm, while thinking about cellular automatons in hardware, I remembered that some cellular automatons have the left and right cells connected together as well as the top and bottom cells, so that the result is a torus grid. Now, in an asynchronous CA it’s unacceptable to wait for a long signal from one end of the chip to the other. But then an idea came into my mind: If you fold the cells two times, at first in the horizontally in the middle, then vertically in the middle, you get a torus grid if you connect the cells at the edges with the cells above. But you also get four layers. Hm, is that possible with the current chip manufacturing processes?
In respect of the asynchronous cellular automatons, I thought about asynchronous chip design in general. In the fourth term of my study, in technical computer science, where we learned about transistors, Boolean algebra, logic minimization and all that stuff, we were also told about the problems that asynchronous designs bring with them, so called hazards (there are hazards of different type, functional, structural… uhm, you know I learned it 4 years ago, so my brain is kind of blurry). A clock makes things tremendous easier, but brings other problems with it. Clock signal distribution is one of them. The clock signal should be synchronous everywhere on the chip. Another problem is, that the is consumes a lot of energy as it’s changing so fast. Another issue is that you discretize the duration of an operation into clock steps. And the clock is determined by the longest signal delay of every pipeline step. So, there is lot of time wasted doing nothing. With asynchronous, data-driven computing, these problems go away. No central clock, no energy waste, no delays. Okay, you loose some performance by waiting for handshake signals, but on the other hand, short operations don’t take longer anymore.
At university, we haven’t learned anything about asynchronous designs. But I recall a very power efficient asynchronous ARM design. In general I think that they are also more failure resilient than synchronous designs, as the central clock has gone (Single Point of Failure).
I’m not sure how asyn. designs work. For example how can be determined when the result of an addition is available? We need to know the signal propagation delay (in synchronous designs we’d choose the maximum of all signal prop. delays). One could use the time needed to load/unload a capacitance, or maybe it would be better to route a signal with the same propagation delay as the addition, and then send a “1” signal on it and wait until it arrives.
With the 100+ of millions of available transistors in todays chips, it would be a nice experiment to try to build cellular automatons in hardware. It’s probably done by some people. But it would be interesting anyways to have 100.000 cells or even more (depending on the complexity of a single automaton) to run in parallel. All this without the need to be clocked. Indeed, that should save a lot of energy, would gain performance (no need to wait for clock propagation). For example, if the cells neighbours do not change, the cell itself has no need to calculate it’s value. Or if it would calculate it’s new value, but this would be the same as before, it wouldn’t have to send an update to the other cells (in this case, every automaton would have to cache the state of all surrounding cells). So in very sparse cellular automatons, this could save huge amounts of energy.
Imagine what you can do with 100.000+ mini-proccessors that work fully parallel (with each running with 1GHz or in the asynchronous case even much more :). okay, it’s not that easy to program cellular automatons, and there are not much algorithms that would run on it. But those that run, would run very fast (sorting for example).
Pre-diploma thesis anyone? :)
Transactions per se do not have anything to do with databases. There are four operations:
- read(object)
- write(object)
- commit
- abort
A partial order is defined on those operations. For example:
t1: r(X) w(X) c
t2: r(Y) w(Y) c
Transaction t1 reads object X, then writes it and finally commits. It’s obvious that those two transaction do not disturb each other, meaning that you can execute them (or parts of them) in any order without giving a different result.
t1: r(X) w(X) ....
t2: r(X)
In the example above, the two transactions are no longer independent of each other (because they access the same object X). You might get a dirty read, depending on the execution order.
r_1(X) w_1(X) r_2(X) ...
If t1 aborts, then t2 has not only read a dirty value, but has also been an inconsistent read. Dirty reads and inconsistent reads have to be avoided. This brings me to the topic of the article.
Out-of-Order Instruction Execution
Modern processors execute the instructions of a program not sequentially but out of order. Instructions have dependencies. There are three types of them:
- True dependency: Read-after-Write (RAW)
- Anti-dependency: Write-after-Read (WAR)
- Output dependency: Write-after-Write (WAW)
True dependecies
MOV EAX, 1
MOV EBX, EAX
You cannot exchange the execution order of those two operations, because you don’t know the value of EAX before execution of the first instruction.
Anti-dependencies
MOV EBX, EAX
MOV EAX, 1
Again, you cannot exchange the execution order of both instructions. But what you can do is, to rename the registers:
MOV EBX, EAX
MOV renamed(EAX), 1
Now the second instruction do not writes to EAX, but to an internal register. Those two instructions can be executed in parallel. But you have to take care, that you later assign EAX the value of the renamed register in the correct order.
This is called register renaming and is used in any modern processor.
Output dependency
MOV EBX, 1
MOV EBX, 2
Again, register renaming is used to be able to execute the two instructions in parallel.
The link to transactions and databases
To increase concurrency, modern databases use Multi-Versioning. That is, they keep multiple versions of the rows. The advantage is that read-only transactions can be executed without being disturbed by writing transactions. There are no conflicts. And even writing transactions can be executed in any order, but you might have to undo them on commit if you detect a conflict with an other writing transaction. That’s exactly what modern processors do with register renaming (multiple version of a register) and out-of-order execution.
What to know more about (database) transactions? These slides are very good!
How are databases used today by peoples writing web-applicatinos? Or what features do you really need?
- Transactions: YES
- FK Constraints: Nice to have, but not neccessary
- Check constraints: Well, you already do that in the model, do you?
- Triggers: BEWARE!
- Aggregations: It depends! Mostly they are too limited, so you do them in any way in your application.
Imagine a very simple database, which supports only little SQL, and you’re using Rails. Would you miss something? I say: NO!
Of course in such a case, where all validation is done in the model, you should not issue SQL commands on your own without using the model. That’s the only danger. But one could live with it. The advantage is, that the database and the model do not duplicate validations, and that the database implementation can be kept at a minumum (you could even get completely rid of SQL!).
What do we really need?
- CREATE TABLE (col type [pk], ...)
- DROP TABLE
- SELECT cols FROM table WHERE cond
- DELETE FROM table WHERE cond
- UPDATE table SET col=val … WHERE cond
And, do we really need JOINS? Essential if you’re going over a wire! So YES! How are JOINs used? I think mostly to connect a tables FK to another tables PK. So how about automatic joins on FK and PKs? And how about FKs and PKs always being integers?
Do we need this?
SELECT t1.a, t2.a FROM table1 t1, table2 t2
WHERE t1.fk_id = t2.id
For ActiveRecords in Rails that’s mostly useless, because you usually get back a model object, so that values from multiple tables cannot be used anyway.
What types do we need? Integers, blobs and user-defined types.
ORDER BY and LIMIT/OFFSET are very useful too, to limit traffic on the wire!
How to implement a database
For most users databases are some kind of magic, that guarantee ACID properties (not all databases do that). But how the database achives this, is mostly unknown. Well, it isn’t that hard! Usually a database uses some kind of B-Trees, that is a balanced binary tree with nodes that contain not just a single value as binary trees do, but contain multiple ordered values. The property of binary trees, that left subnodes have smaller values and right subnodes have larger values apply to B-Tree in the same way. But regular binary trees degenerate very quickly. Of course you can use balanced binary trees, like AVL trees, but they only really work well in memory and not on disk. So, B-Trees are quite simple to implement and are efficient, as they keep multiple values per disk-page.
A page is usually a 4 KB block of memory and should ideally be the same size as the underlying disk-block size. A page stores a B-Tree node or the actual data of the record. Now we have to maintain information about which pages are free and which are not. A very simple approach is to keep a pointer to the first free page, and each free page in turn keeps a pointer to the next free page.
For performance reasons we also need a page cache, which caches often used pages in memory. Uh! What happens if you accidentally plug the power jack? Of course everything in memory gets lost. And how do we guarantee atomicity. For example it might happen, that we insert a new key into the B-Tree, and the B-Tree has to rebalance. Usually this affects multiple pages. Now what happens if we write those affected pages after we’ve modified them back to disk, but in between there occurs an error? We’ve left the database in an inconsistent state from which it’s hard to recover. The key to this problem is a redo-log. This is a file to which we write those changes before touching the database file. We also write a “commit” marker to the log. At least the commit write to the log must be atomic. And important is that we are not allowed to touch the database file before this commit has been written out permanently. In case of a failure, either the log is incomplete (commit has not been written). In this case, we can simply ignore the log, nothing has been written to the database anyways. Or the commit has successfully been written to the log, but during updating the database file, we “plugged the power jack”, so that the database is inconsistent. But we can use the log to redo those changes and bring the DB into a consistent state.
Replicated databases
That’s an advanced topic. There are two major flavours of replication, synchronous and asynchronous. Synchronous replication is slow! You have to wait until each participating database gets the change, before the transaction is finally commited and the client can proceed. So it’s only effective if the access pattern is mostly read-only. Asynchronous replication is (with my limited understanding) mostly used to keep hot-backups, so in case a database goes down, you can proceed with the hot-backup. There’s usually a transaction propagation delay, meaning you might loose already committed transactions, and this is no longer ACID (as transactions are in this case no longer durable). So you can decide between slow and dangerous. Are there alternatives?
Quorum replicated temporal databases
A temporal database is a database, that keeps past records. So records are appended, never overwritten (unless you VACUUM). So the database grows and grows. One advantage to this is, that it’s mostly lock-less. You do not lock a record before you update it. You just write a new record. The old is never touched. Of course you need some kind of locking in the B-tree and page cache and the redo-log. But those locks are only of very short periods of time and are very local. This gives you a very high degree of parallelism. Many threads can access the database at the same time.
But from time to time, you’d have to “optimize” your database. Remove all those old records. That takes time and is best done offline, because in this append-only scenario, you don’t have to keep free-lists, and why would you do something that do can get around? But you cannot go offline with your database, unless it’s replicated. That’s where replication comes to light. That’s a bit more complex then the rest. It needs to be:
- fast
- fault-tolerant
- deterministic
I’m currently trying to understand that topic. There are some issues I haven’t grasped yet. But the priciple of a quorum is indeed very simple. Say you have 5 nodes in total. So the quorum is 3 nodes (n/2 + 1). If you commit a transaction and 3 nodes (the quorum) say “okay”, then the transaction is accepted.
TupleSpaces are an easy and nice way to:
- distribute data across multiple nodes
- decouple the nodes (by using simple tuples)
- let nodes behave data-driven
Two downsides of Rinda, an implementation of TupleSpaces in Ruby:
- The get operation does not scale for many tuples!
- No transactions.
But, if you have too many tuples, you are probably using the TupleSpace in a wrong way. One way to avoid that, is to use credits which are consumed/produces whenever you put/get a tuple out of the space. But that again couples the two processes a bit more.
With no transactions, I mean, that it might happen, that one process takes a tuple out of the space, and later fails, so that the tuple is “lost”. It shouldn’t be too hard to implement transaction, e.g. by using sqlite as storage and then simply use database transactions. By using type definitions for tuples, one could then create on-the-fly the corresponding database table and insert the tuple there (that even works without type definition by only using the number of elements in a tuple as information). Using type information one could improve lookup speed.
TupleSpaces are easy to use. But they are very centralized. And as such, possibly inefficient and a single point of failure.
Yesterday night I hacked a persistent B-tree implementation in pure Ruby. Only key deletion is missing, insertion and find works well. Of course it’s so unoptimized, that smaller block-sizes are faster than larger. This is because on insertion of a key, I write the block back to disk and not keep a cache of dirty blocks in memory. Nevertheless I get a 100-1000 key insertion rate per second. It’s 300 lines of code, and well it’s a hack! I use an “internal” memory management system, meaning that I keep a pointer to the first free block in the header of the Btree file, and each free block points to the next free block. It’s the simplest way to go.
I’m thinking of rewriting it in C using mmap, so that the OS takes care of block caching/management. Don’t know whether this would be portable on Windows, but who cares :). I think it could be pretty fast. But of course you can’t implement your own buffer management algorithms.
B-Trees are really easy to implement! And after that, you really have a small database implemented. No really, you just have to add some meta-data, use multiple B-trees for further indices and of course, which seems to be a bit harder, implement transactions, and you have your own little database. Of course you don’t have query optimization etc. but as SQL is missing anyway, you do it “by hand”. Of course, what else is missing is concurrency. BUT, maybe it’s better to think about avoiding concurrency at all and gain scalability from locality? Let each process use it’s own database, containing non-overlapping entities. Let it operate on it’s owning entities, and from time to time balance (migrate) the number of entities across the databases.
What else is missing in my B-tree implementation? I want to be able to store duplicate keys. Currently only unique keys are allowed. This is fine for the primary key, but not for additional sorting indices. Hm, and how about multiple indices withint one and the same file? Should be easy!
That’s my very own experience. They hinder development especially in the prototyping stage, where the model still is in a very fluent state. Once the model is stable and you’re going into production, migrations are a valuable tool.
Get it here. New features include a tab bar, code completion, visual parens matching and a lot more.
This paper describes an idea (generating Javascript, SQL etc. from one source) I tried to implement about 10 years ago (only the idea was pretty much the same while their approach is really good!). I failed, because my approach ought to be too general. While it’s a very interesting approach, I think it will not become widely accepted. My dream has ever been to have one model from which you can automatically generate the SQL (including check constraints), the server-side validation code as well as client-side validation code. But they all use very different “languages” (SQL, Javascript, Ruby), so the only way is to agree upon a common description language and transform this into each language. Of course, that’s not that easy, and you loose expressiveness of the respective languages.
Several month ago, I worked a bit on Nemo, which is a Ruby implementation of Mewa based on Wee. There we tried to unify the description of storage and server-side validation constraints. In most cases they were mostly the same. But it’s really hard to get it right.
I’ve a table like:
CREATE TABLE feed_subscriptions (
account_id ...,
feed_id ...,
column_position integer,
row_position integer,
unique (account_id, column_position, row_position)
);
Now I want to move certain “rows” one position up or down:
UPDATE feed_subscriptions SET row_position = row_position - 1
WHERE account_id = 1 AND column_position = 1;
Unluckily this fails, even though after the UPDATE statement the constraint holds.
Well, it’s on their TODO list (search for: “Allow DEFERRABLE UNIQUE constraints”).
Yahoo! has released their Javascript UI Library under a BSD license. I’d like to see all those nice widgets to be integrated into script.aculo.us.
Monotone is a distributed revision control system, employing cryptographic methods for ensuring data consistency as well as security (certificates) when used in a distributed setting.
I’m currently learning for the system architecture (SysArch) exam and solving the assignments. For example:
“Suppose you have to desgin the on-board computer for your next trip to Mars. What system goals would you try to achieve with your design?”
Of course the answer is quite easy. It has to be correct, robust and (power-) efficient, but not scalable:
“Scalability of your system, however, may not be a major goal as you are unlikely to find hardware upgrades along the way and you have way different problems when an armada of alien ships forcibly joins your on-board computer into their peer-to-peer system.”
FreeDarwin is a new OS that is based on Darwin, but tries to replace the build-system with what it used in BSD. And it will make use of NetBSD’s pkgsrc.
Aaron Patterson has taken over the development of my WWW::Mechanize library. With this library you can fetch and parse HTML pages, trigger form-submissions and stuff like that. In a single word – you can automate. It’s quite handy for simple stuff, but is limited if there’s lots of use of javascript in the pages.
It has now it’s own project page at rubyforge.
Zuerst dachte ich, wow! Aber als ich dann gelesen hab wofür OCFS2 steht, musste ich irgendwie lachen: Oracle® Cluster File System :)
Ein Grund mehr nicht auf Linux umzusteigen grins. Macht doch bitte was gescheites Jungs ;)
Das Online Buch Java ist auch eine Insel in Version 4. Bin ich grad beim Lernen von System Architecture drüber gestolpert. Ich hoffe nicht, dass ich irgendwann mal dieses Buch (resp. Java) benötigen werde :)
OpenMP works fine with GCC 4.2. I tried it with the following hello world example:
/* hello.c */
#include <stdio.h>
int main(int argc, char** argv)
{
int myid, numprocs;
#pragma omp parallel private(myid, numprocs)
{
myid = omp_get_thread_num();
numprocs = omp_get_num_threads();
printf("Hello, I'm thread %d of %d\n", myid, numprocs);
}
return 0;
}
Now lets produce an executable:
gcc42 -fopenmp -c hello.c
gcc42 hello.o -lgomp -lpthread -o hello
And execute it with 4 threads:
export OMP_NUM_THREADS=4
./hello
What you get will look like:
Hello, I'm 0 of 4
Hello, I'm 1 of 4
Hello, I'm 2 of 4
Hello, I'm 3 of 4
This is great news! Now looking forward to try out OpenMPI, not to be confused with OpenMP!!! It’s message passing and not shared memory, completely different thingy. Harder to program with (hmm, not always), but much more scalable, as you can distribute the paralellel programs on your very own computer cluster (take 4 PCs and connect via Ethernet and your done). Quite a lot cheaper than SMP systems, but of course performs not that good for a low number of processors. If you have more than 64 processors in an SMP system, it’s probably better to switch to message passing :). But if you’ve so many processors, you should be a lucky man, don’t you?
Hab mir gerade eben beim Essen die P.M. durchgelesen und bin dabei auf einen Artikel über Quantencomputer gestossen und musste mir anhören wie schneller die doch seien weil ja ein Qubit zwei Zustände gleichzeitig einnehmen kann usw (bzw. bei Verschränkung mehrerer Qubit dann 2 hoch n Zustände gleichzeitig). Das übliche Gefasel eben. Leider zeigen die Autoren eben nicht die grossen Probleme beim Quantencomputing auf (ob mangels Kenntniss kann ich nicht beurteilen).
Ich bin absolut kein Quantencomputer-Experte! Habe nur mal eine Vorlesung über Quantencomputer-Algorithmen gehört, sehr viel Mathe und Physik, unendlich-dimensionale Hilberträume (ich mein die hiessen sogar anders) und Tensor-Produkte, seitenlange Formeln, viel Wahrscheinlichkeitstheorie und all so Zeugs von dem ein normalsterblicher Mensch eben keinen blassen Schimmer hat (ich auch nicht besonders viel :).
Nun, wie funktioniert das ganze eigentlich? Also man hat n Qubits (die physikalischen Probleme seien mal ausser Acht gelassen). Nun bringt man diese mittels einer Hadamard-Transformation in einen gleichverteilten Zustand. Das ist nichts anderes als ein perfekter Zufallsgenerator! Hallo? Zufall? Jaaaa! Das bedeutet, dass jeder der 2 hoch n Zustände, die die n Qubits einnehmen können, gleichwahrscheinlich auftreten kann. Man könnte jetzt denken, es wäre leicht das ganze auf einem herkömmlichen Rechner zu simulieren. Nein, ist es nicht, da man eben nicht n Wahrscheinlichkeitswerte benötigt, sondern aufgrund der Verschränkung der einzelnen Qubits eben doch wieder 2 hoch n. Naja, ist ja auch egal, letztendlich können wir eine oder mehrere Operationen auf alle Zustände gleichzeitig anwenden. Immerhin reduziert das die Komplexität von exponentiellem Wachstum O(2 hoch n) auf O(1). Das ist seeehr gut, wenn es da jetzt nicht ein klitze-kleines Problemchen gäb.
Das Problem
Und zwar ist das Problem, welches der 2 hoch n Ergebnisse wir auswählen wollen. Wir können uns nur für eines entscheiden, ansonsten hätten wir ja wieder eine Komplexität von O(2 hoch n). Ausserdem erlaubt’s die Physik auch nicht anders. D.h. wir können nicht einfach mal sagen, wir lesen Ergebnis Nummer 1, wenn dass nichts ist, dann Ergebnis Nummer 102 usw. Dazu müssten wir die komplette Berechnung erneut durchführen (was ja wegen O(1) eigentlich nicht so sehr das Problem ist, allerdings ändern sich natürlich die kompletten Ergebnisse bei jedem neuen Versuch, damit auch Ergebnis Nummer 1 und 102!).
D.h. ich muss aus 2 hoch n Ergebnissen, dessen Werte ich ja nicht kenne, eines aussuchen. Ganz toll! Das bringt mir – absolut gar nix! Okay, jetzt gibts natürlich noch total komplexe Algorithmen und Ansätze, die alle auf Wahrscheinlichkeiten basieren, damit z.B. ein spezielles Ergebnis wahrscheinlicher ist als andere. Da kenn ich mich jetzt aber nicht mehr so gut aus. Ist ja auch schon wieder fast ein Jahr her :).
Auch muss jede realisierte Funktion (die physikalisch realisiert werden soll) umkehrbar sein, d.h. es gibt keine Fixpunkte mit denen man arbeiten kann.
Und ein weiteres Problem: Bisher gibt es kaum Algorithmen! Zudem ist es wohl noch nicht möglich einfach mal eine Funktion in einen Quantencomputer einzuprogrammieren, so wie man das von Software her kennt. Man braucht da noch spezielle Versuchsaufbauten usw.
Deswegen war meine Schlussfolgerung, das es für mich kaum Sinn mach, abgesehen vom Interesse natürlich, mich weiter mit Quantencomputing zu beschäftigen. Zumal man vor 40 Jahren schon gross das Parallel-Computing proklamiert hat und es sich heute erst durchsetzt. Mit Quantencomputing wird das sicherlich ähnlich sein. Warten wir 50 Jahre ab. Dann gibts auch Kernfusions-Kraftwerke bzw. NoGraviTops (schwebende Laptops aufgrund von Antigravitation) mit Fusions-Batterie, gesteuert direkt durchs Hirn…
I’m currently writing my Curriculum Vitae for my summer internship somewhere on the planet earth. I try to use LaTeX with CurVe.
At first, how to install a LaTeX add-on? Simply run:
latex curve.ins
latex curve.dtx
Then copy curve.cls and curve.dvi into /usr/local/share/texmf-local/tex/latex/curve and don’t forget to run mktexlsr! Now you’re ready to use it.
Processor complexity grows towards infinity. I’m currently reading Intel® Virtualization Technology for Directed I/O Architecture Specifiction. What they are doing now is the same for the I/O address space as they did starting from the 80286/80386 with the memory. Virtualization! We now get I/O page tables, that allow remapping of addresses and protection based on page granularity. And of course that requires IOTLBs – I/O Translation Lookaside Buffers. We currently have them to cache the translation of virtual into physical addresses, which is indeed very slow to calculate as it requires multiple memory lookups because the page-table is stored in memory (which hopefully is cached).
DMA Remapping? Currently, I/O devices can access any physical address they want. There’s no protected at all. With the new I/O virtualization, each device can be restricted to a specific memory frame.
Wow! Very complex reading. I stop it for now. But my next computer will definitively include hardware virtualization support, and of course will be 64-bit :). Because this would allow me to run Windows, BSD and Linux as guest systems under Xen in parallel (well, at least concurrently but of course it will be at least a dual-core chip).
The best free UML tool is JUDE in my opinion. I tried a lot alternatives, be it ArgoUML (or it’s commercial derivate PoseidonUML), but didn’t liked them. Umbrello, the KDE UML diagramming application is not comparable to JUDE, but it can export as XMI, which is a nice feature.
In short, JUDE is easy to use, runs fast and generates IMHO nice diagrams. It supports all diagram types. It’s fun to use, if you have to.
Heute war der lustige Prof. Daniel Gajski vom Center for Embedded Computer Systems der University of California, Irvine bei uns an der Karlsruher Universität zu Gast und hielt einen Vortrag über seine Entwicklung auf dem Gebiet der Hardware/Software Co-Synthese von eingebetteten Systeme, die es erlauben soll ein System bestehend aus Hardware und Software komplett in C zu beschreiben. Seinen Ansatz nennt er No-Instruction-Set- Computer (ein Konzept ähnlich dem ASIP von Prof. J. Henkel) und proklamiert damit eine Steigerung der Entwicklungsgeschwindigkeit um das 1000-bis 3000-fache zu erreichen.
Das RiskJOBS Portal ist heute morgen online gegangen. So ganz unbeteiligt war ich nicht daran :). Programmiert hat es mein Bruder von MetaCreaTIC und nya ich hab da früher auch ein Teil von programmiert :). Ist richtig gut geworden! Und mit dem neuen Server auch richtig schnell.
Backups sind wichtig! Zwar monitoren wir unsere Platte und planen auch die Platte regelmässig auszutauschen, jedoch ist man nie gegen einen Datenverlust gefeit. Es gibt viele Möglichkeiten für Backups. Und viele verschiedene Programme/Scripts. RSnapshot gefielt mir sehr gut. Es legt zu bestimmten Zeitpunkten Snapshots von bestimmten Verzeichnissen/Dateien an, dabei löscht es ältere Snapshots (es behält nur die letzten n Snapshots). Dies funktioniert auch von einem fernen Rechner aus. Ob man allerdings auch remote Skripte ausführen kann, z.B. um die Postgresql Datenbank zu sichern, weiss ich nicht. Ich meine, remote funktioniert das nicht! Das wäre Kritikpunkt Nummer 1. Weiterer Punkt ist, das zum Ausführen mancher Skripte root-Rechte erforderlich sind. Wir wollen jedoch root nicht über SSH erlauben, also Kritikpunkt Nummer 2. Deshalb hab ich mich aufgemacht, und ein eigenes Backup-Skript geschrieben, in Ruby, das wie folgt funktioniert:
Server-seitig: Bei jedem Aufruf legt es einen neuen Snapshot von den angegebenen Verzeichnisse bzw. Ausgaben von Skripten (DB Sicherung) in einem neuen Verzeichnis an. Das Verzeichnis wird benannt nach der Zeit an dem der Aufruf erfolgt, z.B. “2006-02-11T13:30:21.UTC”. Das Backup wird mit Hilfe von rsync gemacht, d.h. nur neue oder geänderte Dateien werden angelegt, die anderen werden durch hard links aus dem vorangegangenen Snapshot verlinkt. Das macht es relativ Speicherplatz-schonend, bis auf die Verzeichnisstruktur, die jedes Mal komplett neu erzeugt wird. Danach chowned es den komplette Snapshot rekursiv und ermöglicht somit dem backup User den Zugriff (dabei verwende ich ACLs, sodass die normalen Unix-Rechte nicht verändert werden).
Client-seitig: Der backup User kann sich ohne Passwort mithilfe des richtigen privaten Schlüssels einloggen. Auch hier verwende ich rsync um den Abgleich durchzuführen, und lege nur neue oder geänderte Dateien an. Das Skript arbeitet vollständig unabhängig von der Server-Seite. Es erkennt welche Snapshots remote vorhanden sind, und welche lokal bereits existieren, und überträgt dann die Fehlenden, wiederrum basierend auf dem jeweiligen vorangegangenen Snapshot. Das klappt recht gut. Zumindest für den Anfang. Automatische Rotation und so kann man immer noch nachträglich einbauen. Aber man kann ja jederzeit auf dem Server oder auf dem Client einen beliebigen Snapshot löschen, da jeder Snapshot für sich selbst vollständig ist.
Die Folien von kann man sich hier runterladen.
Colemak has been designed as a modern alternative to the QWERTY and Dvorak layouts.
Cool! For those who cannot wait, try out the Omni compiler. It works even great on FreeBSD.
The Xen hypervisor has been released in version 3.0.
On the command line, as simple as:
hdiutil burn -erase image.iso
Remove -erase if you don’t use a -RW media.
Als ich heut morgen den Abschnitt über Prozessoren in der neusten Ausgabe von c’t gelesen hab, bin ich über folgende drei Begriffe gestolpert, die mir irgendwie bekannt vorkamen, aber deren Bedeutung mir nicht sofort einfallen wollte:
- Simultaneous Multi-Threading (SMT)
- Concurrent Multi-Threading (CMT)
- Vertical Multi-Threading (VMT)
Simultaneous Multi-Threading (SMT)
SMT ist durch Intels Hyperthreading wohl am besten bekannt. Es dient dazu, die Funktionseinheiten besser auszulasten, indem Befehle simultan von mehreren Kontrollfäden (Threads) geholt und entsprechend ihren Abhängigkeiten dann zur Ausführung gebracht werden, d.h. in der Instruktion-Queue befinden sich Befehle von mehreren Threads. Da Befehle unterschiedlicher Threads nun gleichzeitig zur Ausführung kommen können, muss das Registerfile mehrfach vorhanden sein (ein Registerfile pro Faden). Das eigentliche Problem der mangelnden Auslastung der Funktionseinheiten (ALU, FPU) ist durch die bedingten Sprünge begründet, die in der Regel jeden 7. Befehl ausmachen, und man eben erst nach vollständiger Ausführung der Bedingung weiss, von welcher Stelle man nun weitere Befehle holen muss (Sprung genommen oder nicht). Durch Vorhersage des Sprunges (Branch Prediction) lässt sich dieses Problem weitestgehend beheben. Allerdings hilft das auch nicht viel wenn voneinander abhängige Befehle ausgeführt werden sollen, womit man wieder bei der sequentiellen und somit nicht superskalaren Befehlsausführung wäre, und viele der Funktionseinheiten unbenutzt blieben. Hier kommt nun SMT ins Spiel, da Threads vollständig voneinander unabhänige Befehlsströme darstellen und somit in jedem Fall mindestens zwei Befehle (bei 2-fachem SMT) gleichzeitig ausgeführt werden können, abgesehen von irgendwelchen Randbedingungen. Zusammengefasst: Instruction-Queue wird aus mehreren Kontrollfäden mit Befehlen versorgt; Registerfile mehrfach vorhanden. Befehle verschiedener Threads können echt gleichzeitig ausgeführt werden.
Concurrent Multi-Threading (CMT)
Beim CMT hat nun jeder logische Kern seine eigenen Funktionseinheiten sowie Register, ist also aufwändiger als SMT, und die einzelnen Einheiten werden wiederum nicht mehr optimal ausgenutzt, wie dies beim SMT der Fall war. Allerdings teilen sich die logischen Kerne andere Ressourcen wie Caches oder die Pipeline und sind damit weitaus weniger aufwändig wie echte Multicore Designs. Zusammengefasst: Pro Thread eigenes Registerfile und Funktionseinheiten vorhanden, andere Ressourcen geteilt.
Vertical Multi-Threading (VMT)
Beim VMT geht es ähnlich wie beim SMT darum, die Funktionseinheiten möglichst optimal auszunutzen. Anders als jedoch beim SMT, das sehr feinkörnig und simultan mehrere Kontrollfäden ausführen kann, stehen beim VMT Befehle die eine sehr lange Ausführungszeit besitzen im Vordergrund. Solche Befehle können z.B. Lade- oder Speicheroperationen sein, also Befehle die auf den immer noch verhältnismässig sehr langsamen Speicher zugreifen, oder aber bestimmte Fliesskomma-Operationen sein. Um zu vermeiden das diese Befehle die gesamte Befehlsausführung blockieren, wir in einem solche Fall auf den nächsten Thread umgeschaltet. Intel sprich hier auch von Switch On Event Multithreading (SoEMT). Ein eigenens Registerfile pro Thread ist also nicht notwendigerweise erforderlich, da Befehle mehrerer Threads niemals simultan ausgeführt werden können. Jedoch muss das Registerfile jedes Threads irgendwo zwischengespeichert werden. Dies kann allerdings ein etwas langsamerer Speicher auf dem Chip sein, da hierauf nur sporadisch bei Threadwechsel zugegriffen werden muss, und nicht ständig wie bei Registern. Daher ist VMT das am wenigsten aufwändigste Verfahren (gemessen an zusätzlichen Transistoren). Holt dagegen aber nicht soviel Leistung heraus wie dies bei SMT der Fall ist.
... of IPv4 address space, as this article makes me believe. In maybe 5 years we’re out of space. For me that are great news, as I’m looking forward to use IPv6 and get rid of NAT and all it’s problems. It’s also much easier for server administration, especially if you want to setup multiple web-servers on port 80, but which each runs under a different user. Currently, this is only really possible using a proxy module, which sucks!
Looks really nice and very useful.
The ZFx86, a great embedded x86 chip which has integrated controllers like EIDE, PCI and USB as well as an integrated BIOS, is live again after a very long and tedious legal battle with it’s chip manufacturer.
Yesterday I discovered the Dvorak keyboard layout. It’s more friendly to your hands than the QWERTY layout, and is said to be more efficient.
Read the Dvorak Zine and try it yourself.
can be downloaded here
The design of the FreeBSD homepage has changed. It looks now much more professionell.
cd directory/to/copy
pax -w -x sv4crc . |
ssh root@remotehost "cd directory/to/copy/into && pax -r -p e"
You should use "root", to maintain correct ownership of files. Use format sv4crc, as most other formats are limited to a path-length of 255.
To double-check that all files were transfered correctly, make use of mtree:
cd directory/to/copy
mtree -c -k md5digest -p . |
ssh root@remotehost "cd directory/to/copy/into && mtree -k md5digest -p ."
If there’s no output, everything is OKAY!
user_pref("network.protocol-handler.app.http", "/usr/local/bin/konqueror");
user_pref("network.protocol-handler.app.ftp", "/usr/local/bin/konqueror");
- Boot from FreeBSD 6.0 CD-ROM.
- Escape to the boot loader and type set boot_askname.
- At the "mountroot>" prompt type ufs:da0s1a
And voila, it boots from the root partition of the first slice of da0. Note that the kernel will be loaded from the CD-ROM and not from the USB harddisk.
cdparanoia -B -- "1-"
burncd -f /dev/acd0 audio *.wav fixate
Rite will be a complete rewRite of the Ruby interpreter, whereas Ruby2 will be the new revision of the Ruby language.
Rite:
- bytecode
- generational GC
- native threads
- new regular expression engine (Onigurama)
- m17n (multilinguization)
Ruby2:
- real keyword arguments (not just hashes)
- new block variable scope
- pre/post/wrap methods
- selector namespaces (?)
- a few other modifications that clean up the language (but no dramatic changes, Ruby will stay the same)
This is mainly a repetition of Matz’s ideas presented at RubyConf 2003 "How Ruby Sucks" (did Guido van Rossum ever told you How Python Sucks? :-). More information on this subject is given in Rite or RiteSuggestions.
end summary::
New Local Variable Scope
Block parameters are block-local, whereas all other variable assignments inside the block do no longer create block-local variables.
But what about block-local variables other than the parameters?
Solution 1
By using more block-parameters than actually used for the arguments. For example:
ary.each do |i, aLocal, anotherLocal|
...
end
This seems odd to me, as it’s no longer clear how each behaves, which variables are actually parameters, and which are not.
Solution 2
Introduce new syntax for block-local variables. For example:
ary.each do |i| <aLocal, anotherLocal>
...
end
Or more verbose:
ary.each do |i|
local aLocal, anotherLocal
...
end
A local * or <*> would then mean, that all used variables are local to the block.
Keyword arguments
Old hash syntax for method calls
Will the following become obsolete?
def foo(hash)
...
end
foo("port" => 8080, "url" => "...")
Instead (Rite):
def foo(**hash)
...
end
Or explicitly (Rite):
def foo(port: nil, url: nil)
...
end
foo(port: 8080, url: "...")
I would argue for making it obsolete, i.e. removing it, for the sake of clearness.
No parentheses?
Will it still be possible to ommit parentheses? Will the following work?
foo port: 8080, url: "..."
I don’t like this when used in combination with keyword arguments, as it reminds me of Smalltalk, but is completely different (first part is a method instead of a receiver/object in ST).
Passing Arguments Through
Will this work?
def wrap_a(*args, **keys, &block)
a(*args, **keys, &block)
end
Of course, it should!
Complex argument lists
Will this be possible?
def foo(a, x=5, *args, b: 42, c: nil, **keys, &block)
...
end
foo(1)
# => a=1, x=5, args=[], b=42, c=nil, keys={}, block=nil
foo(1, 2, 3, 4, c: 5, port: 8080) {...}
# => a=1, x=2, args=[3,4], b=42, c=5, keys={port: 8080}, block={...}
I really don’t want to write a parser for this ;-)
Don’t care argument list
As argument lists become more complicated, wouldn’t it be nice to have a syntax for don’t care? This is especially useful for pre and post methods:
def foo(a, *args, b: 42, c: nil, **keys, &block)
...
end
def foo:pre(...)
# don't care about the arguments
end
Don’t need to write:
def foo:pre(*a, **b, &p)
...
end
Maybe … isn’t appropriate, as it reminds me of ellipses in C++. Probably no big win in clearness, as usually pre and post methods are used for meta-programming and not by the programmer itself.
Another possible solution is to not perform arity checks for pre or post methods, so that argument lists can be ommited, when not needed:
def foo(a, b, c)
...
end
def foo:pre
puts "pre"
end
But that would make them a bit more special than normal methods are, which is IMHO not a good choice.
Pre/Post/Wrap Methods
Great improvement for meta-programming like Aspect Oriented Programming (AOP).
Recursive Make Considered Harmful Peter Miller (auf Deutsch)
Go To Statement Considered Harmful Edsger W. Dijkstra
Sorry, but the article is in German.
www.linuxenterprise.de/itr/online_artikel/psecom,id,407,nodeid,9.html
Imagine you have written a method that works on arrays, using the [] method to access specific elements. Let’s call this method process:
def process(anArray)
...
anArray[i] # access i'th element
...
end
At the beginning there’s no problem with using this method on arrays, as your arrays are small:
arr = Array.new
for i in 1..1000
arr[i] = i**2
end
process(arr)
After some time, your array grows, but not the number of elements in it, say only a few elements in your array are defined, but with indices in the range from 1 to 1 billion.
arr = Array.new
arr[23] = 66.5
arr[25_333] = 4
arr[1_000_000_000] = 8
process(arr)
You’ll probably not have enough memory to try out this example, as arrays in Ruby will automatically grow to their largest index. In the example above, we would need around 4 GB of memory, for only three elements.
Clearly, what you need is a sparse array. In Ruby, that’s simply a Hash:
arr = Hash.new
arr[23] = 66.5
arr[25_333] = 4
arr[1_000_000_000] = 8
process(arr)
Notice how nothing except the first line has changed.
Defining a default value for undefined elements in the hash is simple again:
arr = Hash.new(default_value)
...
...
This would return default_value for non existing indices/keys.
Imagine you now want to call the process method with an array that contains all numbers from 1 to 1 billion in reversed order. Neither arrays nor sparse arrays will help here, due to their high memory consumption. What we need are functions, or in Ruby, Proc objects.
arr = Proc.new {|i| 1_000_000_000 - i}
process(arr)
The function (or Proc object) in the above example can be seen as virtual array. We can get the value of the i’th element in the same way we would have done with an array:
arr[i]
which of course in the case of a Proc object is equivalent to:
arr.call(i)
This nice consistency is what I was talking about at the beginning of the article. Did you get it :-)
The presentation slides can be found here:
But, which graphics library should I use?
I am not looking for a graphics library that rules the world. All I want is to draw directly into the frame-buffer. In the good ol’ DOS days, it was so simple. Two assembler commands to switch into mode 13h (320x200x8):
mov ax, 13h
int 10h
Then I was able to write directly into the frame-buffer (I believe the memory location was 0xa000).
I started with evaluating libGGI, as it’s very portable. But after I was ready with the program, I realized that the pixel routine was too slow. So I tried Clanlib, but it failed to compile on FreeBSD. I though, ok, let’s try libSDL. I read some docs about SDL, but the pixel routine in the tutorial tried to handle all possible color depths. Then I tried Allegro. I used it in the past (yeah, good ol’ DOS days), and found it very pragmatic, not trying to abstract every piece of my hardware.
Allegro
Wow, I hacked up my first little program in 5 minutes. It really looks like programming in DOS. Everything is so simple.
if (allegro_init() != 0)
{
// fail
}
install_keyboard();
set_color_depth(32);
if (set_gfx_mode(GFX_AUTODETECT, 800, 600, 0, 0) != 0)
{
// fail
}
acquire_screen();
// ... draw on screen
release_screen();
readkey();
It is easy to create a bitmap (for double buffering), write directly into it, and copy it to screen:
BITMAP *bmp = create_bitmap(800, 600);
for (int row=0; row<600; row++) {
for (int col=0; col<800; col++) {
((unsigned int*)bmp->line[row])[col] = makecol(255,0,0); // fill red
}
}
blit(bmp, screen, 0, 0, 0, 0, 800, 600);
And peng, it’s displayed on the screen, without delay! That’s all I need for my fractal zoomer. Great!
As it was my first FreeBSD port (I submitted several NetBSD ports in the past), it took me about two hours to make it perfect. But the next time it will require no more than 15 minutes, as it’s really simple, once you’ve got the idea.
Now you can install it by simply typing:
portinstall algae
Or if you haven’t installed portinstall (BTW it’s written in Ruby):
cd /usr/ports/math/algae
make && make install && make clean
Here’s the problem report:
When installed, call gt from the command line.
Nowadays, most modern mail clients like Evolution or Mozilla-Mail include a full-featured SMTP client to send email to a central mail-relay (usually your providers’ one). And they usually support SASL authentification and encryption over SSL/TLS, as your provider will (hopefully) not accept mail for relay by everyone. But other email clients still exist - the famous mutt for example - that do not come with a SMTP client at all.
Installation
Build Postfix with Cyrus-SASLv2 and SSL/TLS support.
Configuration
Edit /usr/local/etc/postfix/main.cf where you add the following lines:
relayhost = mailhost
smtp_sasl_auth_enable = yes
smtp_sasl_security_options = noanonymous
smtp_sasl_password_maps=static:username:password
smtp_use_tls = yes
sender_canonical_maps=hash:/usr/local/etc/postfix/sender_canonical
Replace mailhost with your provider’s mail server, username and password with your own mail account settings.
Further create the sender_canonical file with following content:
@your.localhost.domain your-real-address
On my system that looks this way:
@michael.neumann.all mneumann@ntecs.de
This will rewrite the envelope address from mneumann@michael.neumann.all (that’s my local address) to mneumann@ntecs.de (that’s my remote address). You still need to build the hash database:
postmap /usr/local/etc/postfix/sender_canonical
Finally, make sure the main.cf file is not world readable (chmod 640), as it contains your password!
Notes
If you are your own provider (I am), and you’re using Postfix as your mail-relay, make sure that the permit_sasl_authentificated directive comes before any of the reject_xxx directives. This makes live easier with mail programs that produce mails with missing headers.
Here are some pictures and facts about our new 1U server.
The case is made in Germany (DÄMO) and was indeed the most expensive piece at all:

Have a look inside of it:

The facts:
- Intel P4, 2.4 GHz
- Tyan Trinity Mainboard with Dual LAN (GBit/MBit), on-board VGA and on-board Promise RAID.
- 512 MB Infineon DDR-SDRAM PC266, ECC, CL2 (this part seems to be out of production right now :(
- 2x80GB Maxtor Diamond Max Plus 6Y080P0, mirrored (RAID-1)
- Power consumption: approx. 120 Watt
Infineon, what else?

Should I mention that it is
?
ATTENTION: Don't run such a beast inside your room - it's simply too loud! The two radial blowers have a noise level similar to a vacuum cleaner.
Acknowledgements: This howto relies heavily on Kirby Menzel and Lucas Peet’s great Postfix+Courier-IMAP+MySQL for multiple domains HOWTO (kirb.insanegenius.net/postfix.html). In the underlying howto, I list the steps I’ve used for my own setup. It has an additional section about SMTP authentification using Cyrus SASL (and Mysql) and some more information not found in the above howto.
Required Software
- (FreeBSD 4.8)
- Postfix
- Courier-IMAP
- MySQL
- OpenSSL (SSL/TLS)
- Cyrus SASL v2
Installing Mysql Database
cd /usr/ports/databases/mysql40-server
make install BUILD_OPTIMIZED=yes
make clean
A group mysql and user mysql has been added automatically, as well as a startup script /usr/local/etc/rc.d/mysql-server.sh.
After starting the mysqld server, modify the password with:
/usr/local/bin/mysqladmin -u root password 'new-password'
When calling mysqladmin later, pass the -p option, which will ask you for the current password, otherwise the access will be denied.
TIP: use security/apg, the automated password generator, to create a password.
Installing postfix
cd /usr/ports/mail/postfix
make
(choose "Cyrus SASLv2", "SSL and TLS" and "MySQL map lookups" as options,
then in SASLv2 menu, choose "MySQL password Authentification")
make install
make clean
Note that this will additionally install the Mysql-323 client libraries (if you’ve installed MySQL 4.x).
Modify /etc/mail/mailer.conf to use Postfix, or let the install program do that for you (make install will ask you).
For automatic startup when system starts, disable sendmail in /etc/rc.conf:
sendmail_enable="NONE"
Then make the following symbolic link:
cd /usr/local/etc/rc.d
ln -s /usr/local/sbin/postfix postfix.sh
Important: The MySQL server should be started before postfix (rename mysql-server.sh into 100.mysql-server.sh or something alike; number go before characters).
Then create /etc/periodic.conf with following content:
daily_clean_hoststat_enable="NO"
daily_status_mail_rejects_enable="NO"
daily_status_include_submit_mailq="NO"
daily_submit_queuerun="NO"
This will disable some Sendmail-specific daily maintenance routines.
Check /etc/mail/aliases, then run newaliases, otherwise Postfix will not be able to deliver mail locally (unless you setup virtual domains etc.) Local mail will be usually delivered to /var/mail.
You can disable sending email locally (using sendmail) by setting the following option in main.cf.
mydestination =
But make sure you know what you are doing!
Setting up the Mysql tables
mysql -p -u root
create database maildb;
CREATE TABLE transport (
domain varchar(128) NOT NULL,
transport varchar(128) NOT NULL,
UNIQUE KEY domain (domain)
) TYPE=MyISAM;
CREATE TABLE users (
id varchar(128) NOT NULL,
address varchar(128) NOT NULL,
clear varchar(128) NOT NULL,
crypt varchar(128) NOT NULL,
name varchar(128) NOT NULL default '',
uid smallint(5) unsigned NOT NULL default 5000,
gid smallint(5) unsigned NOT NULL default 5000,
home varchar(128) NOT NULL,
domain varchar(128) NOT NULL,
maildir varchar(255) NOT NULL,
quota integer unsigned NOT NULL,
imapok tinyint(3) unsigned NOT NULL default '1',
PRIMARY KEY (id),
UNIQUE KEY id (id),
UNIQUE KEY address (address),
KEY id_2 (id),
KEY address_2 (address)
) TYPE=MyISAM;
CREATE TABLE virtual (
address varchar(255) NOT NULL,
goto varchar(255) NOT NULL,
UNIQUE KEY address (address)
) TYPE=MyISAM;
GRANT SELECT
ON maildb.*
TO maildb_user@localhost
IDENTIFIED BY '****chose a password here****'
;
The user maildb_user will be used by both Postfix and Courier.
Modify Postfix’s main.cf
Add these lines:
transport_maps=mysql:/usr/local/etc/postfix/mysql_transport.cf
virtual_mailbox_maps=mysql:/usr/local/etc/postfix/mysql_virtual_mbox.cf
virtual_uid_maps=mysql:/usr/local/etc/postfix/mysql_uids.cf
virtual_gid_maps=mysql:/usr/local/etc/postfix/mysql_gids.cf
virtual_mailbox_base=/var/spool/postfix/virtual/
virtual_maps=mysql:/usr/local/etc/postfix/mysql_virtual.cf
# 100 MB
virtual_mailbox_limit=102400000
virtual_minimum_uid=100
…and create the mysql_XXX.cf files:
# mysql_transport.cf
user=maildb_user
password=******
dbname=maildb
table=transport
select_field=transport
where_field=domain
hosts=localhost
# mysql_virtual_mbox.cf
user=maildb_user
password=*****
dbname=maildb
table=users
select_field=maildir
where_field=address
hosts=localhost
# mysql_uids.cf
user=maildb_user
password=*****
dbname=maildb
table=users
select_field=uid
where_field=address
hosts=localhost
# mysql_gids.cf
user=maildb_user
password=*****
dbname=maildb
table=users
select_field=gid
where_field=address
hosts=localhost
# mysql_virtual.cf
user=maildb_user
password=******
dbname=maildb
table=virtual
select_field=goto
where_field=address
hosts=localhost
Important: Don’t forget that the mysql_xxx.cf files should not be world readable (mode 400) and chown postfix:postfix them, due to the fact that they contain a mysql password, and the password allows to get the users password out of the table! Whenever you modify these files, make sure it is readable by postfix:postfix (e.g. if you check the file out of a RCS repository, beeing the root user, the file will be owned by root and not by postfix, so that the postfix daemon cannot read the file anymore!)
Also add $transport_maps to mydestination.
Finally Steps
Create /var/spool/postfix/virtual/ and chown it to user/group postfix. For each user, create a mailbox directory and chown it to them.
Filling Data into the Mysql tables
For creating the crytped password, use Mysql’s encrypt function.
A simple example (one domain, one user, multiple aliases):
mysql> select * from transport;
+----------+-----------+
| domain | transport |
+----------+-----------+
| ntecs.de | virtual: |
+----------+-----------+
1 row in set (0.01 sec)
mysql> select * from users;
+---------+------------------+-------+---------------+-----------------+------+------+...
| id | address | clear | crypt | name | uid | gid |
+---------+------------------+-------+---------------+-----------------+------+------+...
| michael | michael@ntecs.de | abc | 9GZRP/ADKguPk | Michael Neumann | 5000 | 5000 |
+---------+------------------+-------+---------------+-----------------+------+------+...
...+----------------------------+----------+-------------------+-----------+--------+
| home | domain | maildir | quota | imapok |
...+----------------------------+----------+-------------------+-----------+--------+
| /var/spool/postfix/virtual | ntecs.de | ntecs.de/michael/ | 102400000 | 1 |
...+----------------------------+----------+-------------------+-----------+--------+
1 row in set (0.00 sec)
mysql> select * from virtual;
+-------------------+------------------+
| address | goto |
+-------------------+------------------+
| mneumann@ntecs.de | michael@ntecs.de |
| @ntecs.de | michael@ntecs.de |
+-------------------+------------------+
2 rows in set (0.00 sec)
TODO: It might be a good idea to rename table users to mailbox, virtual to aliases and perhaps transport to domain.
NOTE: Each email that should be deliverable, must have an entry in the virtual table, even if it points to itself (e.g. mneumann@ntecs.de => mneumann@ntecs.de).
For more information about Postfix and Mysql read /usr/local/share/doc/postfix/MYSQL_README.
Courier Imap
cd /usr/ports/mail/courier-imap
make WITHOUT_PAM=yes WITH_MYSQL=yes
make install
After installation, we first create a certificate for imapd-ssl. First edit /usr/local/etc/courier-imap/imapd.cnf (copy from imapd.cnf.dist):
RANDFILE = /usr/local/share/courier-imap/imapd.rand
[ req ]
default_bits = 1024
encrypt_key = yes
distinguished_name = req_dn
x509_extensions = cert_type
prompt = no
[ req_dn ]
C=DE
L=Frankfurt
O=Courier Mail Server
OU=Automatically-generated IMAP SSL key
CN=localhost
emailAddress=postmaster@ntecs.de
[ cert_type ]
nsCertType = server
Edit the [ req_dn ] section as you like.
Then, create the certificate with:
/usr/local/share/courier-imap/mkimapdcert
Now we’ll copy the example configuration files to their real name:
cd /usr/local/etc/courier-imap
mv authdaemonrc.dist authdaemonrc
mv authmysqlrc.dist authmysqlrc
mv imapd.dist imapd
mv imapd-ssl.dist imapd-ssl
Edit authdaemonrc and remove all but "authmysql" from authmodulelist.
Edit authmysqlrc:
MYSQL_SERVER localhost
MYSQL_USERNAME maildb_user
MYSQL_PASSWORD *****
MYSQL_SOCKET /tmp/mysql.sock
MYSQL_DATABASE maildb
MYSQL_USER_TABLE users
MYSQL_CRYPT_PWFIELD crypt
MYSQL_UID_FIELD uid
MYSQL_GID_FIELD gid
MYSQL_LOGIN_FIELD id
MYSQL_HOME_FIELD home
MYSQL_NAME_FIELD name
MYSQL_MAILDIR_FIELD maildir
MYSQL_QUOTA_FIELD quota
MYSQL_WHERE_CLAUSE imapok=1
Edit imapd:
nothing to edit for now
In imapd-ssl there are two important options:
IMAPDSSLSTART=NO (enables imapd over SSL)
IMAPDSTARTTLS=YES (imap via TLS: TLS is application level encryption)
Both should be set to these values by default, so no editing is needed here.
Setting IMAPDSSLSTART=YES and IMAPDSTARTTLS=NO would work too, but TLS is newer, so we’ll use it here, and I believe it transmits unsensitive data in clear text (the connection establishing part, until password and data is transfered, then switchs to encryption). Again I think, TLS is easier to cope with in terms of logging etc.
Do some cleanup:
cd /usr/local/etc/rc.d
rm courier-imap-imapd.sh.sample (we don't need this)
rm courier-imap-pop3d.sh.sample (we don't need this either)
ln -s /usr/local/libexec/courier-imap/imapd-ssl.rc courier-imap-imapd-ssl.sh
And start the daemon (or reboot):
sh /usr/local/etc/rc.d/courier-imap-imapd-ssl.sh start
Postfix SMTP Setup
By default, everyone from everywhere (untrusted client) is allowed to send your postfix server mail he is responsible for (mydestination paramter, or relay_domains), whereas relaying to other hosts is not allowed.
Only allow relaying on the host itself by setting:
mynetworks_style = host
Remove saslauthd.sh from /usr/local/etc/rc.d as we don’t need it:
cd /usr/local/etc/rc.d
rm saslauthd.sh
In /usr/local/lib/sasl2 create file smtpd.conf:
# smtpd.conf
pwcheck_method:auxprop
mech_list: plain login
mysql_user: maildb_user
mysql_passwd: *******
mysql_hostnames: localhost
mysql_database: maildb
mysql_statement: select clear from users where id = '%u'
# mysql_verbose: 1
After everything works as expected, we can turn "mysql_verbose" off (comment it out).
Make this file chmod 400, as it contains the mysql database password.
Note that for SASL authentification we need the clear-text password (column clear), whereas for IMAP autentification the crypted-password (column crypt) is required. This is a bit strange.
Again, modify postfix’s main.cf configuration file (in /usr/local/etc/postfix):
smtpd_sasl_auth_enable = yes
smtpd_sasl_security_options = noanonymous
smtpd_sasl_local_domain =
broken_sasl_auth_clients = yes
smtpd_recipient_restrictions =
permit_mynetworks,
permit_sasl_authenticated,
reject_unknown_sender_domain,
reject_unauth_pipelining,
reject_unknown_recipient_domain,
reject_non_fqdn_sender,
reject_non_fqdn_recipient,
reject_non_fqdn_hostname,
check_relay_domains
Enable TLS/SSL
We need a certificate first. But as we still have one for the Courier-IMAP server, we’ll reuse that. We should give it 400 permissions first:
cd /usr/local/share/courier-imap
chmod 400 imapd.pem
Then again, we edit Postfix’s main.cf configuration file, where we add the following:
smtp_use_tls = yes
smtpd_use_tls = yes
smtp_tls_note_starttls_offer = yes
smtpd_tls_key_file = /usr/local/share/courier-imap/imapd.pem
smtpd_tls_cert_file = /usr/local/share/courier-imap/imapd.pem
smtpd_tls_CAfile = /usr/local/share/courier-imap/imapd.pem
smtpd_tls_loglevel = 1
smtpd_tls_received_header = yes
smtpd_tls_session_cache_timeout = 3600s
tls_random_source = dev:/dev/urandom
# allow authentification (e.g. PLAIN/LOGIN) only in TLS mode
smtpd_tls_auth_only = yes
The last directive (smtpd_tls_auth_only) is very important, as it only allows authentification to be performed when TLS is established. This protects the client, to accidentially send their passwords in clear text.
Todo’s
- run Postfix in chroot’ed environment.
- run Courier-Imap as non-root user
- change /tmp/mysql.sock to /var/mysql/mysql.sock
- use mail/messagewall to protect even more against spam
Window-Manager
I’ve tried a whole bunch of different window managers (or desktop environments): KDE, Gnome, WindowMaker, AfterStep, Enlightenment, MetaCity, SawFish, Xfce, ROX, IceWm and some others I do not remember anymore.ROX Desktop
ROX (rox.sf.net) is one of the best I’ve ever used, if it wouldn’t be so buggy (ROX-Filer faults on showing large directories). It’s very easy to use, looks nice and has a great file explorer (ROX-Filer).
IceWm
IceWm (icewm.sf.net) I even prefer more. I’ve used it years ago before trying all the other window managers out. Now, I’ve finally switched back using it.
IceWm is small, solid and rocking stable (I’ve never ever seen it crashing). It’s themeable, easyly customizable through simple configuration files and has multiple virtual desktops. But most important is that the windows behave as I think they should behave (maximizing etc.. all very Window-ish), no exotic behaviour.
Mail-Clients
As with window managers, I’ve tried a lot of different mail programs. On Unix, I started with Pine, then switched to mutt, then tried a lot of GUI mail readers, e.g. KMail, Sylpheed, Balsa, Mozilla Mail or Evolution.
Evolution
Evolution is very similar to Outlook, everything is included (Groupware). It looks very nice and usually responds quickly (unlike Mozilla), but seems to hang very often on my computer.
Mozilla Mail
Next best of GUI-clients is Mozilla-Mail, but it’s a bit slow. I’ve used it instead of Evolution, but wasn’t happy with it, either.
Mutt
I’ve removed both of them, and installed mutt again. The reason why I didn’t used mutt a long time was that no SMTP client is included, so you have to use sendmail or something similar (for the solution of this problem see my article at www.ntecs.de/blog/Tech/UNIX/Mail).
There’s nothing you cannot customize in mutt. Marcros, custom keybindings and more. Furthermore, mutt is extremely fast and rocking stable. Should I mention it’s text-mode?
Editors
Vim
Vim is great once you’ve learned it. It’s like learning machine-typing. Once you’ve got it, you’re incredible fast. Vim’s syntax high-lighting is very good and supports nearly every language (currenlty more than 300).
I tried Emacs, but as I often need to start an editor from command line, it’s startup takes too long. Emacs is a real best, everything is included. I don’t like that, but that’s my personal opinion. Emacs is a very good editor!
WWW Browsers
I currently use Opera and Mozilla Firebird (Phoenix) as well as the text-mode browsers elinks and w3m (with image support). Opera has good bookmark support but dumps core every so often.
Shells
I use zsh (or the default tcsh as root on FreeBSD).
Versioning/Configuration Managament
CVS
I avoid it’s usage whenever I can. It was good as long as no other, better, tools existed. But it’s a mess when renaming files, and there are many other historical mistakes.
arch
This project seems to be canceled.
aegis
Very nice tool, but it imposes a process on the programmer, so it’s not as lightweight as subversion, OpenCM or CVS.
subversion
Very good program! It’s the next generation CVS. I’ve used it since rev. 1868 (?), but now I’ve dropped it in favour of OpenCM.
I found it very hard to compile subversion myself (without using the FreeBSD port), as it depends on several libraries (neon, apr).
OpenCM
I use it for all my new projects. It’s commands are similar to CVS (even simpler) and has all the features subversion provides (except WebDAV), plus some more (further versions will reintroduce support for replication). OpenCM makes intensive use of cryptography, integrity of objects is ensured (=> high-assurance).
OpenCM can be used to version more generally any kind of object, not just files. The server side do not even know about files. It’s the client tool that maps file names and attributes to objects.
OpenCM is developed by the developers of EROS, the Extreme Reliable Operating System (www.eros-os.org).
Operating Systems
I’ve used in this order: TOS (ATARI), DOS, Windows (3.x, 95, 98), Linux (Suse 6.0), Windows, NetBSD, NetBSD, NetBSD, FreeBSD. Of course I’ve also tried Win2000/XP but never really used them.
Furthermore, I’ve evaluated QNX, Plan9 and EROS as well as some minor ones.
Linux
Maybe Suse’s fault, but Linux never made me excited. Linux is no operation system at all, it’s a kernel, but that’s another story (I don’t try to flame).
NetBSD
This is my favorit! Its code base is of a very high quality, there’s no featuritis and no hype. And of course, it runs!
FreeBSD
I am currently running FreeBSD 5.0 on my home computer and my server. It’s generally more up-to-date than NetBSD (it has more ports) and that’s why I do not currently use NetBSD. But there’s no huge difference. I like all BSDs!
Plan 9
The Unix-successor from Bell Labs. It has features not found in Unix, e.g. namespaces or no root user. Almost everything is a file in Plan 9, even the Window Manager (8 1/2) is controlled by simple file I/O.
It’s free, but has a strange license. Hardware support is limited.
EROS
EROS, the Extreme Reliable Operating System, is based on capabilities. Everything down to sectors on disk or single memory pages are controlled by capabilites (non-guessable random numbers). It’s security can be proven by mathematics.
It’s processes are persitent without interaction by the programmer. Every 5 minutes, the processes are check-pointed and written to disk. Thus, EROS do not need a file-system (and does not have one).
EROS is not (yet) usable for users and will probably never be! It’s for high-reliability and security.
The commands below did the job (substitute user, password, hostname etc. appropriatly):
mkdir local-dir
ftpcopy -n -u user -p password hostname / local-dir
The slash (/) specifies in this case the remote directory. Option -n tells ftpcopy to not delete the files from the FTP server after they have been copied.
ScriptAlias /blog "/home/mneumann/cm/my_rublog/index.cgi"
When using virtual hosting, make sure the following directive is inside your <VirtualHost>:
AddHandler cgi-script .cgi
Replace /home/mneumann/cm/my_rublog/index.cgi with the path to your RubLog CGI script.
Say you want to redirect every path below /proxy on your main webserver to another webserver listening at localhost:8080. All you have to do is to modify your httpd.conf file by adding the following lines:
LoadModule proxy_module libexec/apache2/mod_proxy.so
LoadModule proxy_http_module libexec/apache2/mod_proxy_http.so
<Location /proxy>
ProxyPass http://localhost:8080
</Location>
Finally restart the main webserver (apachectl restart).
- ci file
- check file into repository and remove it.
- co file
- checkout file (file will be readonly, so you cannot modify it).
- co -l file
- checkout file and lock it (you need to lock the file prior to modifying it).
- ci -u file
- equal to ci file && co file.
- ci -l file
- equal to ci file && co -l file.
So, to modify a file already under version control, call co -l file. This will check it out in locked mode. After modifying it, use ci -u file to commit your changes and put a copy into the current directory, so that it can be read by daemons or processes.
With a file not yet under version control, proceed as follows:
- If there’s not yet a RCS directory (in the same directory as the file), create one.
- Put the (unmodified) file under version control by issuing ci -l file.
- Modify it.
- Commit your changes and check the file out, in non-locked mode: ci -u file.
So here is a tip to prevent locking yourself out:
At first edit /etc/rc.conf and disable the firewall by setting firewall_enable="NO" (do not forget to reverse this step later).
Next, open up two ssh sessions and become root user. Now, before you change your rules, type at the other terminal:
sleep 100 && reboot
Then apply the firewall rules (ipfw flush && ipfw /etc/ipfw.rules). If you’ve not locked yourself out, you can simply abort the "software watchdog timer" by typing Ctrl-C, whereas in the case you’ve locked yourself out, the computer will reboot after 100 seconds and as we’ve disabled the firewall in rc.conf, after reboot it will be open up for you again.
Security Announchements
Subscribe to FreeBSD-security-notifications@FreeBSD.orgMake System Files Unchangable
Make kernel and /bin unchangeable:
chflags schg /kernel
chflags -R schg /bin
chflags -R schg /modules
To undo use noschg instead schg.
Even root cannot delete / modify them (when in securelevel >= 2). So be careful to not make your rc.conf files unchangable, unless you exactly know what you do.
To display file flags use ls -lo.
Securelevels
Level -1:
no restrictions
Level 1:
- Cannot load / unload kernel modules
- Disabled /dev/mem etc.
- no access to raw devices
- no X windows
Level 2:
Same as level 1 plus the following:
- cannot write diretly to mounted / unmounted filesystems
- cannot alter system time by more than 1 second
Level 3
Same as level 2 plus the following:
- cannot modify ipfw rules.
Conclusion
As long as you’re not modifying your firewall rules very often, run Securelevel 3, otherwise go with Securelevel 2. On my private co-location server, I’m running Securelevel 2, as I need to modify firewall rules from time to time (e.g. enable another port for a user).
