I remember the last time I made a large change to the system. About a year ago. The worst part was translating all the old posts into the new database. This time, thankfully, it's not seeming so bad. Maybe I didn't make as many changes.
Well, truthfully, I don't have the comments importing yet, so there may still be problems. But I just wrote something to grab all the posts on my page from the remote server ("oak") and put them into the new database here on my desk ("tulip".) It takes about a minute over this 56k dialup to suck it all down. Not bad.
I'm probably not as close to being done as it seems from the news that I'm already importing the old database. This is just for some testing so I can make sure I'm not missing any obvious problems just because I don't have many posts in the system.
The biggest finding so far, which is obvious but still worth noting, is that everything works really fast when it's all on the same machine. Wow. What a pleasure to not have to wait even one second for anything to happen. This makes the online experience feel rather lacking in zip. I wonder how much of a difference a dedicated server will make?
Did some good work yesterday, and over the weekend, after last Friday's failing. Seems I need some down time after every few productive days. It feels like I'm not doing anything, but maybe that's not actually the case. It's interesting the way the mind works on problems.
It used to be that I slept very deeply straight through the night. Every time. Now that hardly ever happens. Lately my pattern has been (as best I can make out) to sleep for a few hours. Then I wake up after a first round of dreams. Lie in bed for a while unable to get back to sleep. After about 30 seconds my mind starts working on code again. I can't help it so I just let it go. Eventually I fall back asleep, and then after another dreaming episode I wake up again and start back in on the code. I think I had about four cycles last night. This is pushing back my wakeup time a bit in the morning. But on the other hand, I'm actually working during the night.
Yesterday evening I was sitting at the bar at aKa having a glass of wine, waiting for MB, and writing notes to myself about things still needing to be done in the new system. And then it hit me. Not only could I move comment page directory entries out of the main directory table, but I could just get rid of them altogether. Without losing any functionality! It's easy to make huge breakthroughs when your original idea was so far off. Here are some numbers concerning this site which might illustrate why I'm so excited about this:
There are 4,379 entries in the directory table. These entries correspond to the 4,379 pages on this site.
But under my new system I'll only need 231 entries to categorize the same information (most of those 4,379 pages are threaded comment pages which, it turns out, don't really need to be in the directory.)
The subscription table is even worse. With one entry for every user's subscription to every page we presently have 144,507 entries in the table (33 users X 4,379 pages.)
In the redesigned system this number would be 7,623. I will have to add one column to the table to make this happen, but still it's a major savings.
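To check the arithmetic behind those numbers, here is a minimal sketch. It assumes the subscription table really is just one row per user per directory entry, which is how the figures above work out; the function and variable names are made up for illustration.

```python
# Figures quoted in the post above.
PAGES = 4379        # directory entries today, one per page
USERS = 33          # users with subscriptions
NEW_ENTRIES = 231   # directory entries needed under the redesign


def subscription_rows(users: int, directory_entries: int) -> int:
    """Assumed model: one subscription row per user per directory entry."""
    return users * directory_entries


old_rows = subscription_rows(USERS, PAGES)
new_rows = subscription_rows(USERS, NEW_ENTRIES)

# Fraction of subscription rows that disappear in the redesign.
reduction = 1 - new_rows / old_rows

print(f"old subscription rows: {old_rows:,}")
print(f"new subscription rows: {new_rows:,}")
print(f"reduction: {reduction:.1%}")
```

Since the user count is the same in both cases, the savings in the subscription table are the same proportion as the savings in the directory table.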
These are some startling results. We'll see if I can make it work. If so, then my fears about hitting the ceiling for how large this system could scale have been pushed back quite a bit. I'm going to try to get it minimally working today.
H. ordered one of the brand new (well, O.K., "speed bumped") dual 1GHz G4 Powermacs today. 5 days ship time. I'm very curious how 10.1 feels on that compared to a 500MHz G3. Little faster I'd guess.
I'm pleased to report that I am just now concluding the work day having done precisely nothing. Starting out this morning I had hopes of perhaps accomplishing very little, but by late afternoon I began to sense the possibility for much much more. At 5:00 I had a brush with disaster - almost opening up the text editor to look at some code - but a quick trip to slashdot took care of that errant impulse. And from there it was smooth sailing. Zeros across the board. A perfect game. I think congratulations are in order.
Still in deep. Things are going well though (or at least they seem to be, given my lack of any formal testing ability.) Here are a few links I came across today while looking for other things:
Rebecca Blood (of Rebecca's Pocket) has a book coming out titled The Weblog Handbook.
Scoble reports on a brief demonstration of blogger pro ($5/month.)
Techno-weenie has posted an RFC on creating a "common XML-RPC API for weblogs." I'd support something like this but I wonder how useful it will be. The best reason to have so many different blogging (or knowledge management, or whatever) products is that they all work a little bit differently. I might try to support a common API, but not if I had to give up some of the stranger features I have built into my system. And if the structures of different systems are too different then the XML-RPC is going to be a mess. Or, in other words, the problem is here:
option (struct) -- extra weblog-specific options passed in the format parameter_name => parameter_value. Ideally, the options should be totally optional and the XML-RPC methods should function correctly without them.
Ideally maybe, but at least in my case, the option struct would have to contain lots of crucial information (given the rest of the structure presented.)
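Here is a rough sketch of what the worry looks like on the wire, using Python's standard xmlrpc.client module. The method name "weblog.newPost" and the option keys (parent_id, thread_depth) are invented for illustration; they are not taken from the RFC or from my system.

```python
import xmlrpc.client


def build_call(title, body, options=None):
    """Serialize a call to a hypothetical 'weblog.newPost' method.

    The trailing struct carries the weblog-specific options; it is
    sent empty when the caller has nothing system-specific to say.
    """
    params = (title, body, options or {})
    return xmlrpc.client.dumps(params, methodname="weblog.newPost")


# A generic client: the options struct is empty, as the RFC hopes.
plain = build_call("Hello", "First post")

# A system with threaded comments: the "optional" struct ends up
# carrying information the server cannot do without (hypothetical keys).
custom = build_call("Hello", "Reply", {"parent_id": 42, "thread_depth": 2})

# Decode the request again, as a server would.
params, method = xmlrpc.client.loads(custom)
print(method, params[2])
```

Both calls are valid XML-RPC, but only the first is portable. A client that doesn't know about the extra keys can't usefully talk to the second kind of server.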
Two pages of pictures of 76 Clinton St. progress.
Well, I'm sure you're dying for an update.
The new system (running locally on my production server) is almost usable. Note that this does not mean it is close to being done. At this point I can:
create pages
edit pages
view pages
post
add comments
edit posts and comments
None of these are tested very well. I'm still trying to figure a way to write a tool that will help me test (basically a tool that can simulate many different posts and edits from many different simulated users.) Right now in order to test I make a bunch of different accounts, sign in as different people, make posts, sign out, and then sign in as someone else and see if everything is OK. This takes a lot of time, and it is not very thorough. That's why everyone here has discovered so many bugs.
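A minimal sketch of the kind of tool I mean, written against an in-memory stand-in rather than the real system. Everything here is hypothetical: the FakeSite class, the rule that only the author may edit a post, and the invariant checked at the end. A real version would drive the actual post and edit scripts instead.

```python
import random


class FakeSite:
    """In-memory stand-in for the real system (hypothetical)."""

    def __init__(self):
        self.posts = {}    # post_id -> (author, text)
        self.next_id = 1

    def post(self, user, text):
        post_id = self.next_id
        self.next_id += 1
        self.posts[post_id] = (user, text)
        return post_id

    def edit(self, user, post_id, text):
        author, _ = self.posts[post_id]
        if author != user:
            # Assumed rule: only the original author may edit.
            raise PermissionError("only the author may edit")
        self.posts[post_id] = (author, text)


def simulate(site, users, steps, seed=0):
    """Run random posts and edits from random users; count rejected edits."""
    rng = random.Random(seed)  # fixed seed so a failing run can be repeated
    rejected = 0
    for step in range(steps):
        user = rng.choice(users)
        if not site.posts or rng.random() < 0.5:
            site.post(user, f"post {step} by {user}")
        else:
            post_id = rng.choice(list(site.posts))
            try:
                site.edit(user, post_id, f"edit {step} by {user}")
            except PermissionError:
                rejected += 1
    return rejected


site = FakeSite()
users = [f"user{i}" for i in range(33)]
rejected = simulate(site, users, steps=1000)

# Invariant: whatever happened, each post's text was written by its author.
for post_id, (author, text) in site.posts.items():
    assert text.endswith(f"by {author}")

print(f"{len(site.posts)} posts, {rejected} edits rejected")
```

The fixed seed is the useful part: when an invariant breaks, the same sequence of simulated sign-ins, posts, and edits can be replayed exactly.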
In addition to that testing tool (which I may not be able to write) I still have to enable the following:
admin
subscribe
upload
user
index
These should be easier than what I've done already. Post, edit, and the main script that draws the pages, as well as the one that draws comment pages, are where most of the changes are.
Then, when that is all done I have to write something to transfer the old database (I'm going to start with a different site than this one as the test, but it has the same database structure as this site so I should only have to write the translator once) into the new format. This is always tricky.
Could be done with just a few more full days' work if I don't hit any snags.
Up early this morning. I made MB breakfast because I feel bad about how hard she is working on the new restaurant. I worked hard yesterday too, but it's different. We had a great dinner at Bond St. last night where I talked maybe too loudly about the dangers of "property rights fundamentalism" (as Lessig recently turned the phrase.) Actually I kept trying to drop the whole thing, but nonetheless conversation continued to circle back. All I was saying was that no sort of content protection will ever work as long as people have access to general purpose computers. But apparently that led me into some sort of dystopian fantasyland. Still, I'm not sure I'm so far off. There is a lot of content, and a lot of powerful people who own that content and want it to be secured. And if it's true (trust me, it's true) that there is no way to secure it while people have access to general purpose computers, then it stands to reason that those same powerful forces will try to go the legislative route and outlaw general purpose computers. My guess is they probably won't win, but I think it's pretty clear they are going to give it a shot. Notice that Valenti recently called people sharing (pirated) video content "terrorists" (last week in the NY Times, but I don't have a link.) That seems pretty calculated to me.
Maybe this stuff is already in the Patriot Act. I don't think anybody has even read that whole thing yet. We'll see.
Hopefully I can get another long day in today. I have to make some important structural decisions. This is where having been through the problem a few times before really helps. It sharpens your intuition. Or let's hope...
More mozilla strangeness: I can't set a cookie for mozilla 0.9.7 (OS X) from my local server (127.0.0.1)
IE 5 has no problem.
Getting some good work done today (finally!)
My friend J. convinced me that rewriting my code base wasn't necessarily such a bad idea. Especially if I've done it less than three times. Indeed, this will be the third time, so let's see if it's a charm like he suggests.
I'm not doing a complete from scratch rewrite. But I am going through every line of every script. Plus I have changed the database structure slightly to get rid of the scaling problem that threaded comments were going to cause.
I'm trying to add lots of documentation in-line this time. And of course clean up all the glaringly crappy code (at least to a regular crappy level.) But the real focus is on making the whole thing much easier to move from machine to machine. That means abstracting all the machine specific information out of the main code into configuration files. I'm also trying to abstract out as much HTML code as possible so that if I ever come to really care about standard compliance I can fix my good-enough HTML without going into every single script.
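As a small sketch of that separation (not the actual code, which isn't shown here): machine-specific values live in a config file, HTML lives in a template, and the script only glues them together. The section names, keys, and template below are assumptions made up for the example, and the config is inlined as a string so the example stands alone.

```python
import configparser
from string import Template

# Would normally be a file that differs per machine (hypothetical keys).
CONFIG_TEXT = """
[database]
host = localhost
name = weblog

[site]
base_url = http://127.0.0.1/
"""

# HTML kept in one place so it can be fixed without touching the scripts.
PAGE_LINK = Template('<a href="${base_url}page/${page_id}">${title}</a>')


def load_config(text):
    """Parse machine-specific settings from INI-style text."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    return parser


def render_link(config, page_id, title):
    """Build a page link using only config values and the shared template."""
    return PAGE_LINK.substitute(
        base_url=config["site"]["base_url"],
        page_id=page_id,
        title=title,
    )


config = load_config(CONFIG_TEXT)
link = render_link(config, 7, "Archive")
print(link)
```

Moving to another machine then means editing the config file only, and cleaning up the HTML means editing the template only.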
This new improved code base will be running on my home (production) server. When it stabilizes I will get the new colo server and move this new code to that machine. Then I'll move some other people to that machine and see what happens. Then if everything seems good I'll move digitalmediatree to the new machine and run things in parallel for a while. Then finally I'll close this account and move digitalmediatree.com to the new server. Probably that is not too close to happening.
I haven't fully figured this out yet, but I will also try along the way to write something that will sync the contents of the production server database with the digitalmediatree database. I've never done anything like that before. I don't think it will be easy, but I think I can do it. If that works then I will provide a local copy of the system to anyone here who is running OS X. That might turn out to be pretty cool.