Multiple scopes with validates_uniqueness_of in Rails

Previously, I talked about validates_uniqueness_of and scope in Rails.

Well, what happens if you’ve got a case where, say, you’re selling widgets, you’ve got a join model set up with has_many :through associations, and you’re working with the following table:

(we’re listing stores that sell our widgets, and what size and price they sell them for)

Table: saleitems

Columns: widget_id, store_id, size, price

So, now, we want to validate that a widget can be sold at any store, and a widget can be sold many times at one store, but we can only sell a widget-store-size combo once. (You can’t sell a LARGE widget twice at the same store)

I think my previous understanding was wrong. Now that I’ve written some code, I see that you do the following:

You add a “validate_on_create” method to your saleitems model, and it looks like this:

def validate_on_create
  # Reject duplicates of this widget/store/size combination
  if Saleitem.find_by_widget_id_and_size_and_store_id(widget_id, size, store_id)
    errors.add_to_base("you can't do that")
  end
end
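
As an aside, depending on your Rails version, validates_uniqueness_of may also accept an array for :scope, which would express the same rule declaratively. A minimal sketch, assuming the saleitems model is called Saleitem:

class Saleitem < ActiveRecord::Base
  belongs_to :widget
  belongs_to :store

  # Each widget/store/size combination may only be listed once.
  validates_uniqueness_of :widget_id, :scope => [:store_id, :size]
end

Either way, a unique index on (widget_id, store_id, size) at the database level is a good backstop, since an application-level check alone can race under concurrent inserts.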

Thanks to: Rails Weenie for the answer


Five Bad Reasons not to use Ruby on Rails

I’m a long-term PHP developer, and I’ve recently started using Rails.  No, wait.  I was pretty much a deeply entrenched long-term PHP developer, and I’ve recently been converted to the Cult of Rails.  I’ve been writing PHP since 1997, back before it had its very own recursive acronym, when it was still called PHP/FI (Personal Home Page / Form Interpreter).  I’ve used it on many, many projects, and I’ve gotten pretty good at making it do whatever the hell I want it to.  If you’re a programmer, you know this concept.  Once you work with a language enough, it becomes second nature — you don’t have to look things up, and you don’t have to think about the tools; you think about what you’re doing, and that makes you a whole lot more productive.

So, I also keep up with things in this Web 2.0 internet world.  I know how to wield Prototype and Scriptaculous, and Model View Controller (MVC) is nothing new.  There’s lots of fanfare about how Rails “brings everything together” and is “omg so great.”  I took some of it as zealotry and the rest as people new to doing web apps finding a tool that’s better than, say, ColdFusion. (Note: I hate ColdFusion.)  Part of this is because so many designers are coding with Ruby on Rails.  So, I’ve been watching from the sidelines, but I haven’t actually gotten into Rails.  Why?  I’ve written a list.  I think a lot of developers would be served by reading this list and taking home some of the points.  Anyway, here goes.

Bad Reasons not to use Rails:

1) It’s not PHP or your favorite alternative language.

This is huge.  There’s just something that feels wrong about having to look up how to print to the screen or write to a file.  It’s been so long since many of us have had to read a manual for anything but, “Oh, what’s the syntax on that command again?” that we’re uncomfortable having to do it again.  This is essentially supreme laziness.  You don’t want to learn something new (Ruby), so you’re going to come up with excuses not to.  Let me tell you something: In this business, you either keep learning new things or you become irrelevant.  Ruby is not a weird language.  You will not have a hard time with it.  Just get a good syntax guide and keep it handy.  I promise you, Ruby is not hard.
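
For what it’s worth, the two examples I just mentioned are about as short as Ruby gets (a throwaway sketch; the filename is made up):

# Print to the screen
puts "Hello from Ruby"

# Append a line to a file (the path is just an example)
File.open("widgets.log", "a") { |f| f.puts "sold a widget" }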

2) But I don’t have to learn Ruby, I can just use Symfony, CakePHP, PHP on Trax, or the like.

I tried this, see.  This was my excuse not to learn Ruby — “I’ll just use a Rails Clone,” I told myself.  Well, blearg to that.  I actually tried three of them before trying Rails, and let me tell you something: none of them compare.  Whether it’s features or documentation, none of them are really there yet.  These kinds of flexible MVC/ActiveRecord-style frameworks are still in their relative infancy, and Rails in 6 months is going to be 6 months better than it is today.  The other projects are months and months behind Rails, and in 6 months, they may not even be to where Rails is now.  Especially with documentation.  ESPECIALLY WITH DOCUMENTATION.

Rails does not have good documentation.  This means that the other guys’ documentation is horrible, horrible crap.  Also, Rails has several books, which basically gives you somewhere to go when you need help.  There’s no CakePHP book.  There’s no Symfony book.  Developers of other projects: If you want traction, you NEED documentation.  Orders of magnitude better than you have now.  Person considering Rails:  you need documentation.  Seriously, for something as big as this, you can’t just fiddle around, you’re not writing it yourself, and it’s a huge pain in the ass to look under the hood every 10 seconds — if you can even find what you’re looking for.  Rails does a lot of things for you, and you have to know what keywords to type where.  You need docs.  Good ones.  This goes for any big framework system.

3) I tried, and it’s complicated.

We’re starting to get to something resembling real problems now.  Yes, you’ve tried using Rails, and yes, it CAN be daunting.  Much of the time, you have to do things “The Rails Way” if you want to actually benefit from the framework rather than fight with it.

First and foremost, the biggest thing you have to accept is ActiveRecord.  Basically, for me, this means that I have to think about my database structure a little bit.  You’ll probably have to learn how to use something called join models, and you’ll have to learn about “has_many :through,” which is how you bend ActiveRecord to do your bidding in many cases.  However, the biggest thing for me was getting away from the notion I had in PHP coding of making one FILE that included other files and could be called directly.  Rails hides all of this from you.  There’s one “entry point” into the system, and everything past that is all “pretty URLs that get parsed and routed to code,” so to speak.  You’re not writing pages anymore, you’re putting pieces together in a framework to build an application.  After a while, you get used to it, and it really does make you more productive.
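
If join models are new to you, here’s a minimal sketch of the has_many :through pattern; the Widget/Store/Saleitem names are just illustrative:

class Widget < ActiveRecord::Base
  has_many :saleitems
  has_many :stores, :through => :saleitems   # the join model gives us the indirect association
end

class Store < ActiveRecord::Base
  has_many :saleitems
  has_many :widgets, :through => :saleitems
end

class Saleitem < ActiveRecord::Base
  belongs_to :widget
  belongs_to :store
end

With that in place, widget.stores and store.widgets just work, without any hand-written SQL.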

4) Frameworks are not flexible.

You’re right.  Frameworks aren’t for everything.  However, if you’re like me, you’ll spend a LOT of time just getting simple CRUD (create, read, update, delete) going, writing the SQL by hand.  Even if you’re highly motivated and a fast coder, this takes a lot of time.  My point?  Even if you spend some time futzing around with bending Rails to your needs, you’ve got ample time to spare, because you didn’t have to do a lot of repetitive busywork.

Point two is that Rails is pretty damn flexible.  Most of my “oh, I can’t do that in Rails” moments are falling by the wayside.  As I become more familiar with Rails, I learn more, and what I had presumed were inflexibilities turn out to be gaps in my knowledge.  In other words, most of the so-called inflexibility is more like, “I’d have to write code to do that…” which is exactly what you’d have to do in any other environment.  It’s just that with Rails, you get so used to having everything automated that you become reluctant to actually write any code.  Yes, that sounds horrible, and it’s partly because I’m still learning Rails.

You see, with PHP and no frameworks, you’re essentially writing all the code yourself.  This is fine and dandy, and you can do pretty much whatever you want.  However, with Rails, you have to do things within the framework.  You have to know where things go, how to add capabilities to the scaffolding that gets built for you.  In most cases, the framework isn’t inflexible; it’s just that you need knowledge of the framework on top of knowledge of the basics (code, GET, PUT, HTTP, forms, HTML, etc.).  You have to know a bit more, but you get LOTS in return.

So, inflexible?  I don’t think so.  Rails is still low-level enough to not put TOO much burden on you.  Extending a content management system like Drupal could lead you into some real inflexibilities.  Rails isn’t a CMS; it’s a framework for building web apps.  This is one time where a framework really isn’t locking you into that much.

5) It’s slow.

This is a problem with Ruby on Rails: it’s interpreted.  There’s also a second problem, but it has an easy solution.  If you want to use Rails with Apache, you need to use either FastCGI or fcgid, both of which are kinda like daemons that keep Rails loaded and running, so that when a user makes a call on the application, all of Rails doesn’t need to re-load.  This makes Rails LOTS faster.  However, Ruby is still an interpreted language, and it’s definitely NOT as fast as PHP + Zend + a good code cache.

This is probably the only good reason to not use Ruby.  However, 90% of your application will probably be just fine, and, if you really need to, you’ll be able to code up your whole app, then rewrite the hotspots in another language in less time than it takes to write the whole thing in PHP.

Additionally, there will probably be a Ruby compiler of some form out for use with rails at some point.  My guess is that at the pace the rails folks operate, we’ll see a ruby compiler out in less than a year.

If you need optimization NOW:

Anyway, one final thought.  I’m not giving up PHP, it’s still very useful.  However, I *have* drunk the rails kool-aid, and it tastes pretty good.


Flock is pretty cool

So I’m trying out Flock, which is a new web browser based on Firefox.  It’s got support for blogging, photos, del.icio.us, RSS reading, etc.

It seems pretty cool so far.

When I first heard about the project, months ago, I was very dismissive.  I thought they were making a useless tool, and that nobody would ever need anything more than Firefox.  Now, after playing with Flock for 30 minutes or so, it seems really damn good.  I see exactly what their vision is, and I want to see MORE of it.  I know they can do a lot more with this project, and I’m excited about it.

I’m even more excited at the prospect that the Firefox folks will see it and incorporate some of the better features back into Firefox, making IT an even better product.


Rails and the MySQL Unique Index with validates_uniqueness_of

So I just found this feature of Rails’ ActiveRecord pattern while I was adding some unique indexes to my MySQL tables. Basically, whenever you have a unique index, you also want some application-level code to say, “hey, you can’t do that,” instead of barfing out a raw database error at the user.
This is handled with validates_uniqueness_of in Rails.

Say we have a table, widgets, that has two columns: id and name. id is, of course, our primary key, but say we want to ensure that name is unique at the database level, so we make name a unique key. Now, we must do some checking to ensure that the code never tries to insert a non-unique name.
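
For reference, here’s one way to add that unique index from a Rails migration (a sketch; the migration class name is made up):

class AddUniqueIndexToWidgetsName < ActiveRecord::Migration
  def self.up
    # Enforce uniqueness of name at the database level
    add_index :widgets, :name, :unique => true
  end

  def self.down
    remove_index :widgets, :name
  end
end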

In Rails, in your Widget model, you’d simply add: validates_uniqueness_of :name.

This magic sauce makes Rails check for an existing record before inserting or updating. If you don’t want the overhead of an index, you could use this without an index, and it would still work.

validates_uniqueness_of provides lots of other magic, most notably the ability to specify a scope for your uniqueness. This is the true power of the feature, letting you do some really complicated things.

Say we had the same table, but all widgets come in multiple sizes, so we have id, name, and size. We can have a widget named Fizzlecutter 3000, but it can come in Small, Medium, or Large. We want to ensure that Fizzlecutter 3000 can be inserted multiple times, but only with different sizes. We do this by saying:

validates_uniqueness_of :name, :scope => :size
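
Put together, the model is just a couple of lines (a minimal sketch):

class Widget < ActiveRecord::Base
  # A name may repeat, but only once per size:
  # "Fizzlecutter 3000" in Small, Medium, and Large is fine;
  # two Large "Fizzlecutter 3000" rows are not.
  validates_uniqueness_of :name, :scope => :size
end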

You can have multiple uniqueness validators, to create complicated business rule sets.

Cool!

Update: I’ve corrected some of my misconceptions about using multiple scopes with one validates_uniqueness_of.  Read the article.

Optimizing Apache and MySQL for Low Memory Usage, Part 2

In Optimizing Apache and MySQL for Low Memory Usage, Part 1, I discussed some system tools and Apache optimization. I’ve also discussed mod_deflate, thttpd, and lighttpd in Serving Javascript and Images — Fast. Now, I’ll talk about MySQL.

Tweaking MySQL to use small amounts of memory is fairly straightforward. You just have to know what to tweak to get the most “bang for your buck,” so to speak. I’m going to try to show you the why instead of the what, so you can hopefully tweak things for your specific server.
We’ll look at the following types of MySQL settings:

  • Things We Can Disable
  • The Key Buffer
  • The Table Cache
  • The Query Cache
  • Max Connections

Roughly, the amount of memory MySQL uses is defined by a fairly simple formula: query_cache_size + key_buffer_size + max_connections * (per-connection buffers). For a low-volume site, the query cache and key buffer are going to be the most important things, but for a larger site, you’re going to need to look at other things as well. Additionally, the key buffer and the query cache are AMAZING performance increasers. I’m only showing you how to lower the amount of RAM MySQL uses because I’m assuming you’re trying to run a few smaller sites that don’t store hundreds of megs of data.
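
As a rough worked example (these numbers are hypothetical, not recommendations):

  query_cache_size = 8M
  key_buffer_size  = 8M
  max_connections  = 30
  per-connection buffers (read, sort, join, net) ≈ 1M per connection

  worst case ≈ 8M + 8M + (30 × 1M) = 46M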

Things We Can Disable

First off, InnoDB requires about 10 megs of memory to run, so disable it. You shouldn’t need it if you’re going small. For those unfamiliar, InnoDB is a different storage engine within MySQL that you can use. It supports transactions and, most importantly (to me, at least), row-level locking. It’s a little bit slower than MyISAM, but it can greatly improve performance later, once you have real concurrency. Basic example: changing a row in a MyISAM table locks the entire table. You can’t do any selects while you’re inserting. If you’re inserting a lot, this can be a problem. InnoDB lets you insert or update a row while still performing selects. It locks just the rows you’re working with, rather than the whole table.

You can disable InnoDB with “skip-innodb”.

You can also disable BDB (Berkeley DB, a deprecated alternative to InnoDB) and NDB, MySQL’s clustering engine. Do this with “skip-bdb” and “skip-ndbcluster”. I haven’t noticed that skipping BDB and NDB reduces memory by much, but if you’re not using them, it can’t hurt.

The last thing you can skip is networking, with “skip-networking”. I haven’t noticed this lower my RAM utilization, but if you’re not accessing MySQL from a remote server, you should use the local Unix socket for better performance as well as better security. If you don’t have MySQL listening on a TCP port, you’re a lot less likely to get hacked. Also, for those of you who might be worried about having to configure PHP to connect to MySQL over the local socket: if you specify localhost as your hostname in mysql_connect() in PHP, it automatically uses the local Unix socket, so there’s no need to worry.
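
Putting those directives together, a my.cnf excerpt might look like this (a sketch for a MySQL 4.x/5.0-era server; check the option names against your version):

[mysqld]
skip-innodb        # drop the ~10 MB InnoDB footprint
skip-bdb           # Berkeley DB engine
skip-ndbcluster    # NDB clustering engine
skip-networking    # local Unix socket only, no TCP port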

The Key Buffer

This is probably the single most important thing you can tweak to influence MySQL memory usage and performance. The MySQL Reference Manual says about the key buffer:

Index blocks for MyISAM tables are buffered and are shared by all threads. key_buffer_size is the size of the buffer used for index blocks. The key buffer is also known as the key cache.

The maximum allowable setting for key_buffer_size is 4GB. The effective maximum size might be less, depending on your available physical RAM and per-process RAM limits imposed by your operating system or hardware platform.

Increase the value to get better index handling (for all reads and multiple writes) to as much as you can afford. Using a value that is 25% of total memory on a machine that mainly runs MySQL is quite common. However, if you make the value too large (for example, more than 50% of your total memory) your system might start to page and become extremely slow. MySQL relies on the operating system to perform filesystem caching for data reads, so you must leave some room for the filesystem cache. Consider also the memory requirements of other storage engines.

In other words, MySQL tries to put everything that’s indexed into the key buffer. This is a huge performance speedup. If you can get every table column in a specific select statement to be indexed, and your entire index fits into the key buffer, the SQL statement in question will be served directly from RAM. It’s possible to take that kind of optimization overboard, but if you are going for speed (not memory), that’s one way to do it.

I can’t say what size you should make your key buffer, because only you know how much RAM you have free. However, you can probably get by with 2-3 megs here, or bigger if you need it. If you want to play MySQL Memory Limbo (how low can you go!), you can look and see how much of your key buffer is actually being used. Essentially, you’ll need to pull a few numbers out of the server with the SHOW syntax and plug them into the following equation:

1 – ((Key_blocks_unused × key_cache_block_size) / key_buffer_size)

This yields the fraction of the key buffer in use. After restarting MySQL, let your site run a while so the key buffer has time to fill up (assuming the site is live; if not, simulate some use first). Then check the usage with the equation above. If you’re running below, say, 0.8 or so, you can probably safely lower your key buffer size.
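
Concretely, you can pull the three numbers straight out of the server (Key_blocks_unused and key_cache_block_size are there in MySQL 4.1 and later) and do the division yourself:

SHOW STATUS LIKE 'Key_blocks_unused';
SHOW VARIABLES LIKE 'key_cache_block_size';
SHOW VARIABLES LIKE 'key_buffer_size';
-- usage = 1 - ((Key_blocks_unused * key_cache_block_size) / key_buffer_size)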

The Table Cache

MySQL seems to think that this is the second most important thing to tweak, and it is: it’s really important for performance, though only marginally so for memory usage. In a nutshell, every time you access a table, MySQL loads a reference to that table as one entry in the table cache, and this is done for every concurrent access of a table. So, if you have 10 people accessing your website simultaneously, and each of them is hitting a page that does a join across 3 tables, you’ll want your table cache set to at least 30. If you don’t, MySQL has to keep closing and re-opening tables, and performance suffers.

You can keep upping the table cache, but you’ll eventually hit a limit on the number of files your operating system can have open, so keep that in mind.

If you have table_cache set a bit low, you’ll see the Opened_tables status variable get high. That variable is the number of times mysqld has had to open a table since startup; if it stays low, you’re rarely missing the cache. If your table_cache is set too low, you’ll have cache misses and you’ll hit the disk, and if it (together with your OS’s open-file limit) is set way too low, MySQL will start throwing “can’t open file” errors at you, and you don’t want that. In summary, hitting the disk occasionally is probably better than paging a lot, so find a balance: lower table_cache to the point where you’re not hitting the disk on every query but also not using up memory unnecessarily.
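
To check where you stand (a minimal sketch; the 64 is just an example value):

SHOW STATUS LIKE 'Opened_tables';
-- If that number keeps climbing while the site runs, raise table_cache in my.cnf:
--   [mysqld]
--   table_cache = 64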

The Query Cache

The Query Cache is essentially a mapping of queries to results. If you run the same query twice in a row, and the result fits in the query cache, MySQL doesn’t have to execute the query again. If you’re going for performance, this can be a huge benefit, but it can also eat up memory. If you’re not running the same queries over and over, this won’t help you much. Chances are it will help, though, and there’s probably some benefit to having 500-1000 KB of query cache, even on a tight memory budget. There are three variables that influence how the query cache works.

  • query_cache_size – This is the total size of the query cache. This much memory will be used for storing the results of queries. You must allocate at least 40k to this before you get any benefit. There’s a 40k data structure overhead, so if you allocate 41k, it “works,” but you don’t have much space to actually get anything done.
  • query_cache_limit – This is the maximum size of an individual query result that is cachable. If you have a 10 megabyte query cache and a 1 megabyte query cache limit, you can fit roughly ten one-megabyte results (and many more smaller ones). This is extremely useful for preventing big queries from busting your cache. Precise benchmarking will probably help you decide what’s best; use your judgement here.
  • query_cache_type – Here, you can turn the query cache on or off entirely. Also, if you want to get really sophisticated, you can leave it on (or off) by default but enable or disable it for specific queries. If you want it to default to ON, leave it on and disable the cache for specific queries with something like “select sql_no_cache * from table”. Alternatively, if you want it to default to OFF, set query_cache_type to “2” or “DEMAND” and write queries that look like “select sql_cache * from table”. (See the sketch after this list.)
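
Here’s the sketch mentioned above, with hypothetical values for a small site (the “widgets” table name is just an example). In my.cnf:

[mysqld]
query_cache_type  = 1      # ON by default; set to 2 (DEMAND) to opt in per query
query_cache_size  = 1M     # total cache; must clear the ~40k overhead to be useful
query_cache_limit = 256K   # skip caching any single result bigger than this

To skip the cache for one query when the default is ON:

SELECT SQL_NO_CACHE * FROM widgets;

Or to opt in when query_cache_type is 2 (DEMAND):

SELECT SQL_CACHE * FROM widgets;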

Maximum Number of Connections

This may or may not be a problem for you, but it’s one of the most important settings when optimizing a MySQL installation for high usage. If you’re already limiting the number of Apache processes, then you’ll be fine. If you’re not, and you need to handle thousands of users simultaneously, you need to increase this number. It’s the number of simultaneous connections MySQL allows. If it’s not set high enough, you’ll get the dreaded “too many connections” MySQL error, and your users won’t be happy. You want to keep this number in sync with the maximum number of clients Apache allows, and you’ll need to budget extra RAM for the extra MySQL connections. See above for the rough formula.
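
A quick sketch, assuming you’ve capped Apache at 50 clients (the numbers are hypothetical). In my.cnf:

[mysqld]
max_connections = 55    # a little above Apache's cap, leaving room for cron jobs and admin logins

And to see the high-water mark since the server last started:

SHOW STATUS LIKE 'Max_used_connections';
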
I’ll cover a few more minor tweaks to MySQL in the next article, where I’ll discuss, among other things:

  • Persistent Connections
  • Other Buffers and Caches
  • Miscellaneous Options