9

I am looking to start developing a relatively simple web application that will pull data from various sources and normalize it. A user can also enter the data directly into the site. If it is successful, I anticipate needing to scale. Is it worth putting in the time now to use scalable or distributed technologies, or should I just start with a LAMP stack? Framework or not? Any thoughts, suggestions, or comments would help.

Disregard my vague description of the idea, I'd love to share once I get further along.

Nayan Jain

5 Answers

8

Later. I can't remember who said it (might have been SO's Jeff Atwood) but it rings true: your first problem is getting other people to care about your work. Worry about scale when they do.

Definitely go with a well-structured framework for your own sanity though. Even if it doesn't end up with thousands of users, you'll want to add features as time goes on. Maintaining an expanding codebase without good structure quickly becomes fairly horrible (been there, done that, lost the client).

btw, if you're tempted to write your own framework, be aware that it is a lot of work. My company has an in-house one we're quite proud of, but it's taken 3-4 years to mature.

Rob Agar
6

Is it worth putting in the time now to use scalable or distributed technologies or just start with a LAMP stack?

A LAMP stack is scalable. Apache provides many, many alternatives.

Framework or not?

Always use the highest-powered framework you can find. Write as little code as possible. Get something in front of people as soon as you can.

Focus on what's important: Get something to work.

If you don't have something that works, scalability doesn't matter, does it?

Then read up on optimization. http://c2.com/cgi/wiki?RulesOfOptimization is very helpful.

Rule 1. Don't.

Rule 2. Don't yet.

Rule 3. Profile before Optimizing.

Until you have a working application, you don't know what -- specific -- thing limits your scalability.

Don't assume. Measure.
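
To make "profile before optimizing" concrete, here is a minimal sketch using Python's standard `cProfile` and `pstats` modules. The question doesn't name a language, so Python is just a stand-in here; `normalize_record`, `slow_dedupe`, and `import_batch` are hypothetical functions invented for illustration, not anything from the asker's application.

```python
import cProfile
import io
import pstats


def normalize_record(record):
    # Hypothetical normalization step: trim whitespace and lowercase the keys.
    return {key.strip().lower(): str(value).strip() for key, value in record.items()}


def slow_dedupe(records):
    # Deliberately quadratic: every membership test scans the whole list.
    seen = []
    for record in records:
        if record not in seen:
            seen.append(record)
    return seen


def import_batch(raw_records):
    # Hypothetical import pipeline: normalize everything, then drop duplicates.
    normalized = [normalize_record(r) for r in raw_records]
    return slow_dedupe(normalized)


def profile_call(func, *args):
    # Run func under cProfile and return its result plus a text report
    # of the five entries with the highest cumulative time.
    profiler = cProfile.Profile()
    profiler.enable()
    result = func(*args)
    profiler.disable()
    stream = io.StringIO()
    stats = pstats.Stats(profiler, stream=stream)
    stats.sort_stats("cumulative").print_stats(5)
    return result, stream.getvalue()


if __name__ == "__main__":
    # Made-up sample data: 2000 records, only 500 distinct names.
    raw = [{" Name ": f"user{i % 500}", "Source": "feed"} for i in range(2000)]
    result, report = profile_call(import_batch, raw)
    print(len(result))
    print(report)
```

The report shows where the time actually goes, so you fix the part that measurably hurts instead of the part you guessed would.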

That means build something that people actually use. Scale comes later.

S.Lott
  • Yes, if measurement and profiling is required, then scaling later makes complete sense. Otherwise what would you measure against? – Rimian Mar 13 '11 at 02:16
3

Absolutely do it later. Scaling pains are a good problem to have; they mean people like your project enough to stress the hardware it's running on.

The last company I worked at started fairly small with PHP and the very very first versions of CakePHP that came out (when it was still in beta). Some of the code was dirty, the admin tool was a mess (code-wise), and sure it could have been done better from the start. But do you know what? They got it out the door before their competitors did, and became extremely successful.

When I came on board they were starting to hit the limits of their scalability, and that is when they decided to start looking at CDNs, lighttpd caching techniques, and other ways to clean up the code and make things run smoother under heavy load. I don't work for them anymore, but it was a good experience in growing an architecture beyond what it was originally scoped for.

I can tell you right now that if they had tried to do the scalability and optimization work before selling content and getting a website live, they would never have grown to the size they are now. The company is www.beatport.com if you're interested in who I'm talking about. (To reiterate, I'm not trying to advertise them, as I am no longer affiliated with them, but it stands as a good case study, and it's easier for people to understand what I'm talking about when they see the website.)

Personally, after working with Ruby and Rails (and understanding the separation!) for a couple of years, and having experience with PHP at Beatport - I can confidently say that I never want to work with PHP code again =p

nzifnab
1

Funny to ask "scale now or later?" and label it "ruby on rails". Actually, Ruby on Rails was created by David Heinemeier Hansson, who has a whole chapter in his book labeled "Scale later" :)) http://gettingreal.37signals.com/ch04_Scale_Later.php

0

I agree with the earlier respondents -- make it useful, make it work and get people motivated to use it first. I also agree that you should pick off-the-shelf components (of which there are many) rather than roll your own, as much as possible. At the same time, make sure that you choose components for your infrastructure that you know to be scalable, so that you can go there when you need to without having to rewrite major chunks of your application.

As the Product Manager for Berkeley DB, I've seen countless cases of developers who decided "Oh, we'll just write that to a flat file" or "I can write my own simple B-tree function" or "Database XYZ is 'good enough', I don't have to worry about concurrency or scalability until later". The problem with that approach is that a) you're re-inventing the wheel (and forgoing what others have already learned the hard way) and b) by going with a 'good enough' solution, you're ignoring the fact that you'll have to deal with scalability at some point.

Good luck in your implementation.

dsegleau