Notes for myself (so I remember when setting up the real server.)
Install minimal system from disc, then:
rpm --import /usr/share/doc/centos-release-4/RPM-GPG-KEY
yum upgrade
yum install perl-DBD-MySQL mysql-server mysql php-mysql mod_auth_mysql openssl-devel openssl mod_ssl php-devel php httpd rpm-build rpm-devel gcc perl-CPAN autoconf automake
Make sure apache and mysql start on every reboot:
chkconfig httpd on
chkconfig mysqld on
And start them both now (since we don't want to reboot):
/etc/init.d/mysqld start
/etc/init.d/httpd start
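A quick sanity check I should note here too (assuming the stock CentOS service names):
chkconfig --list httpd
chkconfig --list mysqld
/etc/init.d/httpd status
/etc/init.d/mysqld status
The first two show which runlevels each service will start on; the last two confirm they are actually running right now.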
Finally have a local network set up here behind the cable modem (I had just been switching the one ethernet cable back and forth between machines depending on what I was working on.) Wireless too, although that is just a little bonus. Anyway, now I can test my server setup more easily, using the laptop as a client. Quiet weekend here, so hopefully I can get a lot done.
I found this ridiculously helpful site from Johnny Hughes, which has a bunch of tutorials for setting up a CentOS server. Johnny Hughes is one of the principal maintainers of CentOS. It's very cool that he seems to care so much about helping newbies like me.
With support like that it really is pretty easy. To set it up at least. I guess the problem is just if something goes wrong. Do you know enough then to fix it? To not lose data? To get back up again quickly? I'm trying to learn as much as I can at this point so as to move the answers to those questions towards yes if anything ever does go wrong.
The last non technical hurdles have been cleared. It has been a strange experience interfacing with the government and the banking industry. These are things I do not have a lot of experience with, and so I was very nervous and kept putting everything off. Turns out it's all pretty easy. Go figure.
Also, I finally have a desk set up at World HQ, so I am no longer sitting on the couch and trying to work. That ought to triple my productivity right there (wouldn't take much, but hey....) Still need a chair though.
Was looking at servers again last night, and of course now that I have waited and thought about it for a while I am tempted to upgrade to the next level. But it's hard to figure. Seems stupid to buy something and then run out of storage right away. On the other hand there isn't much point in buying storage ahead of time since the prices keep dropping so rapidly.
Someone left a slightly larger and much nicer Dell Trinitron monitor in the world headquarters' garbage area. Thanks! I'm taking it as a good sign (after just taking it.)
I am really enjoying learning so much more about linux. What an amazing thing. Has anything else ever been built that is so vast, and at the same time so transparent? The more I learn the more it boggles my mind. All the information you need to learn about the system is an integral part of the system itself.
Every command has a corresponding manual page. You read these manual pages from the command line by typing 'man [command]'. So to read the manual page for, say, the command ifconfig you just type 'man ifconfig' and it spits out a page detailing the proper syntax for this command, a brief description of what it does, and a list of all options. And everything has a man page.
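And if you don't even know the name of the command you want, there is a way to search the descriptions of all the man pages (this is just the part I've learned so far):
man ifconfig     # read the manual page for ifconfig
man -k network   # search the man page descriptions for the word 'network'
apropos network  # same search, friendlier name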
Of course, if you're rather new to the whole thing like I am, and you really don't know what a command does, the man page might be a bit terse. No problem though, that's what google is for. The amount of information is just staggering. Sharing information about the system is built into the social fabric of the community much the way man pages are built into the fabric of the system.
It's very cool. But that's not to say it's easy. People who know how to do things tend to answer questions in a way that puts you on the right track, rather than just telling you exactly how to do it. It's a "teach a man to fish" philosophy. It's not always what the newbies think they want (myself included when I am really stuck,) but it is a great way to learn. It forces you to learn.
But even beyond man pages and google (and mailing lists which I will talk about later,) more serious adepts have the best learning tool of all - the actual code itself. If you don't know why a certain command option isn't working like you thought it would, and the man page is no help, and you can't find the answer in google, you can just open up the source code for that command and start reading. Even if you don't understand the code itself, it will be heavily marked up with human readable comments.
I guess this is the ultimate source of the transparency. Good programmers document their code as they go so that other people, without access to the original programmer's mind, can look at the source code and understand it. Documentation is not an afterthought, but is, like I said before, an integral part of the thing itself. And this philosophy extends from the source code on up.
You can drive a car your whole life and never understand how a carburetor works; but if you administer a computer running linux you are going to eventually get a sense of how the internals work. You almost have no choice. Because of the transparency, learning how to do things is the same as learning how things work.
To me this is fascinating, but it also means that there is a lot to digest. Here's an example. I am beginning to think very concretely about how to organize the file system on the future server. Like many things in this world, there is no "right" way to set it up. Linux is very flexible, which allows you to do almost anything, including shooting yourself in the foot in an almost infinite number of ways. And no one can hand me a foolproof recipe for making sure I don't shoot myself in the foot - at least not one tailored to *my given situation*.
Instead, the people who have the knowledge tend to lay out how they do it, and more importantly *why* they do it that way. They explain the underlying considerations that made them choose a certain path, and by elucidating those underlying conditions (by teaching you about how it works at a more fundamental level,) you can then come to a conclusion for your given situation.
This, again, is the transparency. There are no right answers, except to explain how things work on the next level down. So when looking for answers you are quickly sucked many levels deeper than you might have originally thought you needed to go. And hence you end up learning a lot.
Here is where I ended up in researching file system layout. I don't really need to know all that (nor do I even begin to understand all that!) What I am trying to do is not overly mission critical (lives aren't going to be hanging in the balance,) and it is also not going to have to scale very much, nor will it really be in danger of taxing modern computer hardware. But still I am reading stuff like this, and slowly beginning to get a fuzzy picture of these deeper levels. And that seems to be the linux way. It's turtles all the way down.
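Just to make that a little concrete, the kind of layout the tutorials tend to suggest for a small server looks something like this (the sizes are rough guesses on my part, not gospel):
# /boot   ~100 MB    kernels and bootloader files
# /       ~10 GB     the base system
# /var    ~20 GB     logs, mail spools, and the MySQL databases
# /home   the rest   user accounts and web content
# swap    ~1 GB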
Less abstractly (sorry to make you slog through all that,) I could imagine the test server going to the colo here in NYC at the beginning of the week of the 18th (about a week from now.) And then the new server will follow quickly after that.
The old Penguin server has been pulled out of deep storage, I have acquired a very old Viewsonic 14 inch monitor, and they are set up and ready to go at the new secret Datamantic world headquarters. I am now waiting for my Powerbook to burn disc 1 of CentOS 4.1 i386 (that is a specific distribution and flavor of Linux,) so I can load it into the server and begin to configure this thing.
It is exciting and also a little scary. Like walking around in the dark. I really don't know what is going to happen. I've been studying the very active CentOS mailing list for the last few weeks and there seems to be an awful lot of community support around this distribution. Hopefully that will be enough to get me up and running.
For the record, I am most scared of Bind, followed by email services.
Here goes...
Down on the Jersey shore today. Spent an enjoyable day doing some leisurely coding. I am now very close to realizing the one click web site setup idea. I load one webpage on my local machine and it asks for a few pieces of information about the remote server (address, username, password,) and then it creates the directory structure, uploads all the php files, and sets up the database.
All I have left is to include an option to transfer the contents of the local database to the database on the remote server.
This way I can develop new sites on my local machine, and then when everything is ready, I can move the site to the remote (real) server with one click.
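If I were doing those same steps by hand from the command line instead of through the web page, it would boil down to roughly this (the host names, paths, and file names here are made up for illustration):
# create the directory structure on the remote server
ssh user@example.com 'mkdir -p ~/public_html/newsite/admin ~/public_html/newsite/includes'
# upload all the php files
scp -r ./site/* user@example.com:~/public_html/newsite/
# set up the database from a schema dump
mysql -h example.com -u dbuser -p newsite_db < schema.sql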
Not very difficult, but I am very happy that I have it set up this way. It is taking longer than I thought, but this will save a *lot* of time down the road.
Lighttpd 1.4.0 released as promised, just in time for me to try it out on the new server. Excellent. I am excited about this webserver.
I guess I'll call the new software 'datamantic' and hope that doesn't cause confusion with the site name.
I finished the main part of the installation program yesterday. It's not quite one click, but it's very easy. I still need to ssh into the account to set some directory permissions, but that's not a big deal. I guess I can make a shell script to take care of that (PHP can't do it because it doesn't run as a user and therefore doesn't have access to the file system outside the web root - and that's where I need to change some permissions.)
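Something like this little script would probably do it (I haven't written it yet, so the paths and permissions here are placeholders):
#!/bin/sh
# set the directory permissions the web-based installer can't set itself
chmod -R 775 /home/username/sitedata
chmod -R 775 /home/username/uploads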
This morning I've tackled the last remaining big missing piece in the datamantic software. It's a little embarrassing, because it's so obvious in hindsight that it needed this capability, but until now it couldn't handle floating point numbers. Items (and the metadata atoms of each item) can, in theory, be anything you want. That's the whole point of the system: flexibility. Except they couldn't be floating point numbers, only integers. Obviously for a business system we are going to often need atoms to hold monetary values. And unless we are okay with excluding cents (and of course we're not okay with that,) we need some float support.
And now it's in there, at least for adding new information. I still need to bring the edit script up to speed, but that shouldn't be too hard.
I really want to get this done today so that Monday I can shift focus away from the software and start back to dealing with the hardware side of things.
I still have never said anything explanatory about the new software. And it's not that I haven't tried. Explaining software is difficult. So I'm still hoping for a longer, more thorough post, but here's something shorter about what I am working on presently.
The software is basically done. This is what I wrote several months ago during the initial phase of my long absence from blogging. It is a descendant of the software that runs this site, but much more generalized in an effort to target small business websites (especially inventory-centric websites,) instead of just blogging. Where this site has posts and then a bunch of meta-data associated with each post (author's name, date, summary, comments, etc...) the new software allows you to define anything as the main post - these are called 'Items' in the new system - and then associate any number and kind of metadata 'Atoms' with each item. All items, and their associated atoms, can be specified through the same sort of browser based interface that I use everywhere.
So, in other words, this site works great for blogging, but not for much else. The new software could be set up to blog, but it can also handle the situation where, say, instead of blogging you want to have a website for your wine business. Now instead of blog posts you have bottles of wine as your main items. Each wine then has a bunch of metadata associated with it (grape types, producer, year, etc....) To do this before would require me to write a lot of code. Now I can do it all through a web interface.
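I haven't laid out the actual tables here, but the rough shape of the idea is something like this (the table and column names are invented for the example, not the real schema):
mysql -u dbuser -p winesite <<'SQL'
-- an 'item' is the main thing: a wine bottle, a blog post, whatever
CREATE TABLE items (
    id INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
    name VARCHAR(255)
);
-- an 'atom' is one piece of metadata attached to an item
CREATE TABLE atoms (
    id INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
    item_id INT NOT NULL,
    label VARCHAR(255),
    value TEXT
);
SQL
So the wine site's grape types, producers, and years all just become rows in the atoms table, pointing back at their bottle.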
I wish I could explain it better because it's pretty cool.
You specify what your item and its atoms are going to be like. Text, binary data, numbers? Then the software asks you a few questions about your items and their atoms (like, what sort of form elements will be necessary for adding and editing,) and it builds all the necessary user interface pieces for you. I'm not sure how clear that is, but that's the slick part. The posting and editing scripts are very generalized and they can deal with any data, no matter how it is structured.
The whole project was a matter of abstracting the software that runs this site. Boiling it down, but at the same time radically expanding its scope. Freeing it from the conceptual mold of blogging, while maintaining the blog-like ease of creating and editing right in the web browser.
And, like I said, it's largely done. It's already deployed behind a few sites that I will point to eventually. But how the sites look isn't really the main thing. It's how they run. It's how (hopefully) easy they are to build and maintain. That's the key.
So now I am working on polishing the whole package. This is something I've never done before. I have put the older software behind a number of different blogging type sites (and even tried to adapt it for several business sites,) but it is a monstrous pain in the ass to deploy. It takes me the better part of a day to set it up, and that's assuming everything goes right. There aren't any instructions (which even I need, and I wrote the thing,) it's very unintuitive, and the whole thing is just a sprawling mass of weird hacks and dependencies. It runs well once you get it working, but it gives off a sort of "don't even breathe on it" vibe.
So I'm trying to fix that this time around. Yesterday, for instance, I wrote a program that automatically sets up the database for a new installation (of the new software.) If you can believe it, I've never had an automated way to do this. I would just fire up mysql in an ssh session and create all the tables by hand. That's fine if you're just deploying one site, but my goal now is to be able to deploy lots of sites. Quickly and painlessly. So that means building the automated tools to do it.
I thought I could do it in a few hours yesterday morning. How hard could it be? Ended up taking 10 hours. And 6,000 lines of code (not that number of lines means anything - a better programmer could have done it in less.) But I now have an automated way to set up the rather complex database schema. And not just that, but I can also run it against an existing database and it will analyze every table and fix anything that is not right. This includes modifying create statements on table columns that are not formed correctly, adding missing table columns, as well as adding completely missing tables.
This brings us to the key point of my recent efforts. Maintainability. This is what I learned when I tried to put the old software behind more than one site. The basic problem is that I would make a fix to one installation, but then that fix would not get propagated to the other sites. So they quickly fell out of sync with each other, and then each one had to be maintained as its own independent entity. As they say in the biz: this doesn't scale.
So the new mantra is centralization. And the database creation/updater program I wrote yesterday is the first part of that. If I need to make a change to the database now - say, I realize that the user table needs another column to store a contact phone number for each user - I create that new column in the program I wrote yesterday, and then I run that program across *all* installations of the software. This will keep everything in sync. Building tools to do what you previously did by hand allows you to scale.
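The core of the trick is nothing fancy - for each column the schema is supposed to have, check whether it exists and add it if it doesn't. Roughly like this (with made-up names, and done by hand here rather than from inside the program):
# assumes the database password is in $DBPASS
# does the users table already have a contact_phone column?
if ! mysql -u dbuser -p"$DBPASS" sitedb -e "SHOW COLUMNS FROM users LIKE 'contact_phone'" | grep -q contact_phone; then
    # no - add it so this installation matches the master schema
    mysql -u dbuser -p"$DBPASS" sitedb -e "ALTER TABLE users ADD COLUMN contact_phone VARCHAR(32)"
fi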
Next up is constructing a similar tool to keep the code in sync across installations. The goal is to make changes in one place (say, on the test server on my development machine,) and then when things are running correctly I want to be able to run one script, and have all the changes to both the database structure and the code itself be replicated across all installations of the software. In other words, as a one person business, I'm trying to make myself scale.
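For the code half of that, something as simple as rsync in a loop over the installations would probably cover it (the site list and paths here are hypothetical):
#!/bin/sh
# push the current code from the development copy out to every installation,
# leaving each site's own config file alone
for SITE in site1.example.com site2.example.com site3.example.com; do
    rsync -avz --exclude 'config.php' ./datamantic/ user@$SITE:public_html/datamantic/
done
Then the database updater gets run against each site in the same loop, and everything stays in sync.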
Updates have been sparse, to say the least. I'm just back from some time with my family on Cape Cod. But now it's August, and August is the month for me to get busy. So hopefully there will start to be some real progress, although I guess I have said this before.
The corporation is all set. I've got my little kit from the state. Wow, I didn't realize how much like a game it is. The corporate seal hand puncher thing (looks sort of like a hole punch, except instead of making a hole it creates a raised round corporate seal on a piece of paper,) is pretty cool. I almost expected there to be a secret decoder ring as well.
Now I just have to get the bank account (evidently the only time I will probably ever use the corporate seal,) and I'm ready to start buying hardware.
My thinking has changed a little bit on how to attack the problem, but the specific hardware and software choices have not changed too much. More on that soon.
To the people here who have contributed money, I apologize for the so far rather slow pace. But like I said, things really should pick up now.