Lighttpd 1.4.0 released as promised, just in time for me to try it out on the new server. Excellent. I am excited about this webserver.
I guess I'll call the new software 'datamantic' and hope that doesn't cause confusion with the site name.
I finished the main part of the installation program yesterday. It's not quite one click, but it's very easy. I still need to ssh into the account to set some directory permissions, but that's not a big deal. I guess I can make a shell script to take care of that. (PHP can't do it because it runs as the web server's user rather than as my account, and therefore doesn't have access to the file system outside the web root, which is where I need to change some permissions.)
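As a rough illustration of that permissions step, here is a minimal sketch in Python rather than shell. It is not the actual installer: the function name `set_dir_permissions` and the default mode `0o775` are assumptions made for the example, and the real directories and modes would depend on the host.

```python
import os
import stat


def set_dir_permissions(directories, mode=0o775):
    """Apply `mode` to each directory and return the ones that changed.

    Both the directory list and the mode are placeholders; nothing here
    reflects the real installation layout.
    """
    changed = []
    for path in directories:
        if not os.path.isdir(path):
            raise NotADirectoryError(path)
        current = stat.S_IMODE(os.stat(path).st_mode)
        if current != mode:
            os.chmod(path, mode)
            changed.append(path)
    return changed
```

Run from the ssh session as the account owner, a script like this can reach directories outside the web root, and running it a second time changes nothing.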
This morning I've tackled the last remaining big missing piece in the datamantic software. It's a little embarrassing, because it's so obvious in hindsight that it needed this capability, but until now it couldn't handle floating point numbers. Items (and the metadata atoms of each item) can, in theory, be anything you want. That's the whole point of the system: flexibility. Except they couldn't be floating point numbers, only integers. Obviously for a business system we are often going to need atoms to hold monetary values. And unless we are okay with excluding cents (and of course we're not okay with that), we need some float support.
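To make the idea of typed atom values concrete, here is a minimal sketch in Python. It is an illustration only, not the datamantic code: the function `parse_atom_value` and the type tags `integer`, `float`, `money`, and `text` are invented for the example. The `money` tag is an extra assumption. Binary floating point cannot represent most cent amounts exactly (in Python, `0.1 + 0.2 == 0.3` is false), so a fixed two-decimal type is one common alternative to plain floats for prices.

```python
from decimal import Decimal


def parse_atom_value(raw, kind):
    """Convert a submitted form string into a typed atom value.

    `kind` is a hypothetical type tag chosen for this sketch.
    """
    raw = raw.strip()
    if kind == "integer":
        return int(raw)
    if kind == "float":
        return float(raw)
    if kind == "money":
        # Exact decimal arithmetic, rounded to whole cents.
        return Decimal(raw).quantize(Decimal("0.01"))
    if kind == "text":
        return raw
    raise ValueError(f"unknown atom kind: {kind}")
```

With this, `parse_atom_value("19.99", "money")` plus `parse_atom_value("0.01", "money")` is exactly `Decimal("20.00")`, while `parse_atom_value("2.5", "float")` covers ordinary measurements where tiny rounding errors don't matter.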
And now it's in there, at least for adding new information. I still need to bring the edit script up to speed, but that shouldn't be too hard.
I really want to get this done today so that Monday I can shift focus away from the software and start back to dealing with the hardware side of things.
I still have never said anything explanatory about the new software. And it's not that I haven't tried. Explaining software is difficult. So I'm still hoping for a longer, more thorough post, but here's something shorter about what I am working on presently.
The software is basically done. This is what I wrote several months ago during the initial phase of my long absence from blogging. It is a descendant of the software that runs this site, but much more generalized in an effort to target small business websites (especially inventory-centric websites), instead of just blogging. Where this site has posts and then a bunch of metadata associated with each post (author's name, date, summary, comments, etc.), the new software allows you to define anything as the main post - these are called 'Items' in the new system - and then associate any number and kind of metadata 'Atoms' with each item. All items, and their associated atoms, can be specified through the same sort of browser-based interface that I use everywhere.
So, in other words, this site works great for blogging, but not for much else. The new software could be set up to blog, but it can also handle the situation where, say, instead of blogging you want to have a website for your wine business. Now instead of blog posts you have bottles of wine as your main items. Each wine then has a bunch of metadata associated with it (grape types, producer, year, etc.). To do this before would require me to write a lot of code. Now I can do it all through a web interface.
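The item-and-atom idea can be sketched as a small data model. This is a hypothetical illustration in Python, not the real schema: the class names `AtomDef`, `ItemType`, and `Item`, and the atom names used for the wine and blog examples, are all made up here.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class AtomDef:
    name: str   # e.g. "producer"
    kind: str   # e.g. "text", "integer" (hypothetical type tags)


@dataclass
class ItemType:
    name: str
    atoms: List[AtomDef] = field(default_factory=list)


@dataclass
class Item:
    item_type: ItemType
    values: Dict[str, Any] = field(default_factory=dict)

    def set(self, atom_name, value):
        """Store a value, but only for atoms this item type defines."""
        allowed = {atom.name for atom in self.item_type.atoms}
        if atom_name not in allowed:
            raise KeyError(f"{self.item_type.name} has no atom {atom_name!r}")
        self.values[atom_name] = value


# The same structure describes a blog post or a bottle of wine.
blog_post = ItemType("post", [AtomDef("author", "text"), AtomDef("summary", "text")])
wine = ItemType("wine", [AtomDef("producer", "text"), AtomDef("year", "integer")])

bottle = Item(wine)
bottle.set("producer", "Example Cellars")
bottle.set("year", 2003)
```

The point of the design is that switching from a blog to a wine catalog means defining a new `ItemType`, not writing new posting and editing code.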
I wish I could explain it better because it's pretty cool.
You specify what your item and its atoms are going to be like. Text, binary data, numbers? Then the software asks you a few questions about your items and their atoms (like, what sort of form elements will be necessary for adding and editing), and it builds all the necessary user interface pieces for you. I'm not sure how clear that is, but that's the slick part. The posting and editing scripts are very generalized and they can deal with any data, no matter how it is structured.
The whole project was a matter of abstracting the software that runs this site. Boiling it down, but at the same time radically expanding its scope. Freeing it from the conceptual mold of blogging, while maintaining the blog-like ease of creating and editing right in the web browser.
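Here is one way the "builds the user interface for you" step could look, as a minimal Python sketch. It is a guess at the general technique, not the actual implementation: the function `build_form`, the `WIDGETS` table, and the widget names are assumptions for the example.

```python
from html import escape

# Hypothetical mapping from a chosen form element to its HTML control.
WIDGETS = {
    "text": '<input type="text" name="{name}">',
    "textarea": '<textarea name="{name}"></textarea>',
    "number": '<input type="number" name="{name}" step="any">',
}


def build_form(item_name, atoms):
    """Return an HTML add/edit form for an item.

    `atoms` is a list of (atom_name, widget) pairs, i.e. the answers to
    "what sort of form element does this atom need?".
    """
    rows = []
    for atom_name, widget in atoms:
        field_name = escape(atom_name, quote=True)
        control = WIDGETS[widget].format(name=field_name)
        rows.append(f"<label>{field_name} {control}</label>")
    body = "\n".join(rows)
    item_attr = escape(item_name, quote=True)
    return (
        f'<form method="post" data-item="{item_attr}">\n'
        f"{body}\n"
        '<button type="submit">Save</button>\n'
        "</form>"
    )
```

Calling `build_form("wine", [("producer", "text"), ("notes", "textarea"), ("year", "number")])` yields a complete form with one labeled control per atom, with no hand-written HTML for that item type.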
And, like I said, it's largely done. It's already deployed behind a few sites that I will point to eventually. But how the sites look isn't really the main thing. It's how they run. It's how (hopefully) easy they are to build and maintain. That's the key.
So now I am working on polishing the whole package. This is something I've never done before. I have put the older software behind a number of different blogging-type sites (and even tried to adapt it for several business sites), but it is a monstrous pain in the ass to deploy. It takes me the better part of a day to set it up, and that's assuming everything goes right. There aren't any instructions (which even I need and I wrote the thing), it's very unintuitive, and the whole thing is just a sprawling mass of weird hacks and dependencies. It runs well once you get it working, but it gives off a sort of "don't even breathe on it" vibe.
So I'm trying to fix that this time around. Yesterday, for instance, I wrote a program that automatically sets up the database for a new installation (of the new software). If you can believe it, I've never had an automated way to do this. I would just fire up mysql in an ssh session and create all the tables by hand. That's fine if you're just deploying one site, but my goal now is to be able to deploy lots of sites. Quickly and painlessly. So that means building the automated tools to do it.
I thought I could do it in a few hours yesterday morning. How hard could it be? Ended up taking 10 hours. And 6,000 lines of code (not that number of lines means anything - a better programmer could have done it in fewer). But I now have an automated way to set up the rather complex database schema. And not just that, but I can also run it against an existing database and it will analyze every table and fix anything that is not right. This includes modifying the definitions of table columns that are not formed correctly, adding missing table columns, as well as adding completely missing tables.
This brings us to the key point of my recent efforts. Maintainability. This is what I learned when I tried to put the old software behind more than one site. The basic problem is that I would make a fix to one installation, but then that fix would not get propagated to the other sites. So they quickly fell out of sync with each other, and then each one had to be maintained as its own independent entity. As they say in the biz: this doesn't scale.
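The create-or-repair idea can be sketched in a few dozen lines. This is only an illustration under loud assumptions: it uses Python's built-in `sqlite3` instead of MySQL so that it runs anywhere, the `SCHEMA` dictionary and its table and column names are invented, and it only adds missing tables and columns (it does not repair malformed column definitions the way the real tool does).

```python
import sqlite3

# Hypothetical desired schema: table -> {column: definition}.
SCHEMA = {
    "users": {"id": "INTEGER PRIMARY KEY", "name": "TEXT", "phone": "TEXT"},
    "items": {"id": "INTEGER PRIMARY KEY", "title": "TEXT"},
}


def sync_schema(conn, schema):
    """Bring a database in line with `schema`; return a list of changes."""
    changes = []
    existing = {
        row[0]
        for row in conn.execute("SELECT name FROM sqlite_master WHERE type='table'")
    }
    for table, columns in schema.items():
        if table not in existing:
            column_sql = ", ".join(f"{col} {definition}" for col, definition in columns.items())
            conn.execute(f"CREATE TABLE {table} ({column_sql})")
            changes.append(f"created table {table}")
            continue
        # PRAGMA table_info yields one row per column; index 1 is the name.
        present = {row[1] for row in conn.execute(f"PRAGMA table_info({table})")}
        for col, definition in columns.items():
            if col not in present:
                conn.execute(f"ALTER TABLE {table} ADD COLUMN {col} {definition}")
                changes.append(f"added column {table}.{col}")
    conn.commit()
    return changes
```

Because the function compares the live database against the desired schema before acting, it is safe to run repeatedly: a fresh database gets every table, an old one gets only what it lacks, and an up-to-date one gets an empty change list.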
So the new mantra is centralization. And the database creation/updater program I wrote yesterday is the first part of that. If I need to make a change to the database now - say, I realize that the user table needs another column to store a contact phone number of users - I create that new column in the program I wrote yesterday, and then I run that program across *all* installations of the software. This will keep everything in sync. Building tools to do what you previously did by hand allows you to scale.
Next up is constructing a similar tool to keep the code in sync across installations. The goal is to make changes in one place (say, on the test server on my development machine), and then when things are running correctly I want to be able to run one script, and have all the changes to both the database structure and the code itself be replicated across all installations of the software. In other words, as a one-person business, I'm trying to make myself scale.
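A code-sync tool along these lines might start out like the following Python sketch. It is a hypothetical outline, not the planned tool: the function names `sync_code` and `sync_all` are invented, it assumes every installation is a directory reachable from one machine, and it never deletes files that exist only in an installation.

```python
import hashlib
import shutil
from pathlib import Path


def file_digest(path):
    """SHA-256 of a file's contents, used to detect changed files."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def sync_code(master, install):
    """Copy new or changed files from `master` into `install`.

    Returns the relative paths that were copied.
    """
    master, install = Path(master), Path(install)
    copied = []
    for src in sorted(master.rglob("*")):
        if not src.is_file():
            continue
        rel = src.relative_to(master)
        dst = install / rel
        if not dst.exists() or file_digest(src) != file_digest(dst):
            dst.parent.mkdir(parents=True, exist_ok=True)
            shutil.copy2(src, dst)
            copied.append(rel.as_posix())
    return copied


def sync_all(master, installs):
    """Run sync_code against every installation directory."""
    return {str(install): sync_code(master, install) for install in installs}
```

Comparing content hashes rather than timestamps means a hand-edited file in one installation is detected and overwritten with the master copy, which is the behavior centralization calls for.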
Updates have been sparse, to say the least. I'm just back from some time with my family on Cape Cod. But now it's August, and August is the month for me to get busy. So hopefully there will start to be some real progress, although I guess I have said this before.
The corporation is all set. I've got my little kit from the state. Wow, I didn't realize how much like a game it is. The corporate seal hand puncher thing (looks sort of like a hole punch, except instead of making a hole it creates a raised round corporate seal on a piece of paper,) is pretty cool. I almost expected there to be a secret decoder ring as well.
Now I just have to get the bank account (evidently the only time I will probably ever use the corporate seal,) and I'm ready to start buying hardware.
My thinking has changed a little bit on how to attack the problem, but the specific hardware and software choices have not changed too much. More on that soon.
To the people here who have contributed money, I apologize for the so far rather slow pace. But like I said, things really should pick up now.
Finally got the last of my missing financial documents that are needed to make the new company happen, which is in turn needed to actually purchase the server that I still haven't completely decided on yet. While it may not sound like it, this is a big step. Or, in other words, if you don't do very much moving, any step is a big step. Onward!
My plan was for this next post to explain a little bit about the new software I wrote during my recent break from blogging. It is for building websites. It shares some roots with the software behind this site, except it is more tailored to maintaining large stores of structured data (inventories) than to blogging. And it does one really clever thing I haven't seen before (although, okay, I haven't really examined every other piece of software in this category).
But instead I want to quickly outline a piece of software that I haven't written, and which is, most likely, beyond my ability to write. Still, sometimes just being able to articulate an idea is a big step.
This post will be the "which server should I buy?" thread. I plan on doing this with a handful of central questions so that I can return with comments as time goes on. Maybe if it gets too long I will then create a "2nd which server should I buy" thread. These big question threads will be linked from the right-hand navigation column.
I don't really expect anyone to be interested in all this. But if someone is, that is great, and if anyone can contribute anything to the discussion that would be even better.
Here goes: Which server should I buy?
Given that I can pretty specifically say what the server is going to be used for, I think there should be a fairly definite answer to the question. I just don't know what it is yet.
Here's what it will be used for: web serving. Most likely using Apache (although I guess you have to at least give a look at Lighttpd with all the attention it's been getting lately). Almost all requests will be to PHP scripts generating dynamic web pages by pulling data out of a MySQL database. Additionally there will be a rather large 1+ TB store of ~5 MB binary files that will be served straight from the file system over HTTP to a limited number of simultaneous connections (I don't need this to scale very high). So that's all very basic web server stuff. [The reasoning behind this architecture and the various possible debates here will be a different post.]
I was initially very attracted to Apple's Xserves because Mac OS X is what I know best (and what I build things on locally even though they get deployed on Linux). Plus Apple, and the Apple community, seem a little more friendly in my particular situation, which is something like: I don't mind learning a little and even mucking around on the command line, but it's really not a goal of mine to be a sysadmin. So if Apple can supply me the whole widget, with a nice clean way to automatically download and install binaries, I can just worry about Apache, PHP and MySQL (what I like to do), and not so much about, say, getting nonstandard ethernet drivers to compile under Linux, or trying to set up DNS without a GUI. In fact, I don't mind paying a little more for someone else (Apple) to make these things easy for me.
Upon further research, however, it seems there are some serious performance questions (they may not actually be problems, but they are certainly questions right now) concerning MySQL. And maybe even Apache as well. Ouch. That's exactly what I want to do. OS X Server and the G5 chip (IBM's 970) are amazing at a whole host of tasks. Unfortunately it seems like the exact thing I need to do isn't one of them.
So while I haven't made my final decision yet, I feel pretty sure - again, given specifically what I want to do - that Linux is the OS you are "supposed" to use. This basically means that the programs I need to run are built and optimized with the Linux platform in mind. On the other hand, even if some of the more outrageous claims are true, and MySQL and Apache performance really are an order of magnitude slower on OS X, it might be the case that it is still "good enough". I'm not building eBay here. I think we run on a 700 MHz Pentium right now and I think performance is acceptable. (On the other other hand, I want room to grow...)
So OS X Server vs. Linux is one debate. And then if Linux wins that debate then there is the secondary "which distribution?" question.
I'll get into specific configurations and pricing in the comments.
Even though the rumors were flowing over the weekend, I was still stunned at yesterday's news that Apple is dropping IBM (and their G5 processor), and beginning a two-year transition to Intel x86 chips.
According to Jobs, IBM couldn't deliver the speed, and more importantly couldn't deliver the speed at low power that Apple needs to make the kind of small form factor very quiet machines it loves to make.
Here are a few of my initial thoughts:
For the average end user this makes little real difference. The Mac experience is primarily the experience of the Mac OS, and that isn't going to change.
Most present applications will run unmodified on the new machines thanks to a software emulation layer. Apple is very good at this sort of thing. Still, it's clear they are hoping that developers will do a little bit of work to recompile their apps to take full advantage of the new architecture. Adobe has announced full support which is very important to the Mac community.
You won't be able to buy OS X and run it on a Dell (or any other generic x86 machine.) The Mac OS X will continue to run exclusively on Apple hardware.
Although they won't be officially supporting it, Apple VP Phil Schiller stated that they won't do anything technical to preclude you from running Windows (or, one presumes, Linux) on the new Apple hardware. This might have some interesting benefits for Apple's "switcher" efforts. Now a Windows person can buy a Mac and have the ability to switch back to Windows if they don't like it.
This move seems to confirm that the Cell processor (variations of which will drive the Sony Playstation III and the new Microsoft Xbox), is not a viable desktop processor (or else, presumably, Apple would have stayed with IBM and that future).
There could be something more here than meets the eye. It is at least possible that the switch to Intel has something to do with Hollywood and DRM. We know there is DRM in these new Intel chips. So possibly Jobs is trying to work out something like the iTunes music store for movies with Hollywood, and they simply won't do it unless it runs on these Intel chips. This is pure speculation at this point, but maybe something to keep in mind.
I shudder to think of what this is going to do to sales of present Macintosh computers, especially going into the Christmas season when the first of the new machines will be right around the corner (shipping early 2006.)
For me personally this greatly complicates the already complicated decision I need to make regarding my next server. I can't wait for the new machines (which I'd love to do since being able to wipe the Mac OS and install Linux on x86 is exactly the fallback position I would be most happy with). But do I really want to drop (for me) a huge amount of money on a last generation G5 server? Is that really a machine I will be happy with in 4 or 5 years? My provisional answer is no, which would cause me to just buy an x86 now and run Linux.
But like I said, I don't think this is really a big deal for the average user. The software is going to stay the same. The Mac will still be the Mac even with Intel inside.
Hello again. Long time no blog.
I didn't really plan on taking a hiatus, but sometimes these things happen. I am going to try to resume a normal blogging schedule now. We'll see how it goes.
I've been here, in NYC, but not really getting out much. I have been doing a lot of coding. More on that in following posts.
I am also about to buy a new server and move all the sites from the computer in California to the new server which will be colocated here in NYC. There are some difficult decisions surrounding this purchase, so more on that in following posts as well.
And, lastly, I am also incorporating a company that hopes to use the new system I have just built, plus the new server I am about to buy, to create a business.
This blog will be details and notes from that effort. Probably it will be pretty similar to the stuff I was blogging about before, although hopefully the business quest will give it a little more focus.
I'm trying to get more of that.
Okay, more soon. Nice to be back.
32 hours left...