SmugBlog: Don MacAskill

Thought stream from SmugMug's CEO & Chief Geek

Great things afoot in the MySQL community

Tue, 12/23/2008 - 18:09

tl;dr summary - The MySQL community rocks. Percona, XtraDB, Drizzle, SSD storage, InnoDB IO scalability challenges.

For anyone who lives and dies by MySQL and InnoDB, things are finally starting to heat up and get interesting. I’ve been banging the “MySQL/InnoDB scales poorly” drums for years now, and despite having paid Enterprise licenses, I haven’t been able to get anywhere. I was pretty excited when Sun bought MySQL since their future is intrinsically tied to concurrency, but things have been pretty slow going over there this year.

But the community has finally taken up arms and is fighting the good fight. It’s (finally!) a great time to be a MySQL user because there’s been lots of recent progress. Here’re some of my favorites (and highlights of work left to do):

PERCONA

I can’t sing Percona’s praises enough. They’re probably the most knowledgeable MySQL experts out there (possibly even including Sun). Absolutely the best bang for the buck in terms of MySQL service and support - better than MySQL’s own offering. (If I had to guess why that is, I’d bet that MySQL/Sun don’t want to step on Oracle’s toes by fixing InnoDB - but >99% of what we need is related to InnoDB. Percona has no such tip-toeing limitations.) Let me quickly count the ways they’ve helped me in the last few months:

  • They knew of a super obscure configuration setting, “back_log”. Have you ever heard of it? I hadn’t. But we started seeing latency on MySQL connections (up to *3 seconds*!) on systems that hadn’t changed recently (exactly 3 seconds sounded awfully suspicious, and sure enough, it was TCP retries). After going through every single kernel, network, and MySQL tuning parameter I know (and I know a lot), I finally called Percona. They dug in, investigated the system, and unearthed ‘back_log’ within an hour or two. Popped that into my configuration and boom, everything was fine again. Whew!
  • We have servers that easily exceed InnoDB’s transaction limits. Did you know InnoDB has a concurrent transaction limit of 1024? (Technically, 1024 INSERTs and 1024 UPDATEs. But INSERT … ON DUPLICATE KEY UPDATE manages to chew up one of each). I know all about it - I’ve had bugs open with MySQL Enterprise for more than 2 years on the issue. What’s more, these are low-end systems - 4 cores, 16GB of RAM - and they’re nowhere near CPU or IO bound. It took MySQL months to figure out what the problem was (years, really, to figure out all the final details like the different undo logs for INSERT vs UPDATE). Their final answer? It’ll be fixed in MySQL 6. Note that 5.1 *just* went GA after years and years. On the other hand, it took Percona one weekend to diagnose the problem, and 13 days to have a preliminary patch ready to extend it to 4072 undo slots. Talk about progress! (And yes, we want Percona to release the patch to the world)
  • Solving the CPU scaling problems. These have been plaguing us for years (we have had some older four-socket systems for a while … now with quad-core, it’s even worse), and thanks to Google and Percona, this problem is well on its way to being solved. We’re sponsoring this work and can’t wait to see what happens next.
  • XtraDB. This is the biggy. So big it deserves its own heading….

XTRADB

Oracle’s done a terrible job of supporting the community with InnoDB. The conspiracy theorists can all say “I told you so! Oracle bought them to halt MySQL progress” now - history supports them. Which is a shame - Heikki is a great guy and has done amazing work with InnoDB, but the fact remains that it wasn’t moving forward. The InnoDB plugin release was disappointing, to say the least. It addressed none of the CPU or IO scalability issues the community has been crying about for years.

Luckily, Percona finally did what everyone else has been too afraid to do - they forked InnoDB. XtraDB is their storage engine, forked from InnoDB (and then turbocharged!). We’re not running it in production yet, but we are running all of the patches that went into XtraDB and I can tell you they’re great. We’re sponsoring more XtraDB development (and yes, we made sure Percona will be contributing anything they build for us back to the community) with Percona, and I’m sure that’ll continue.

DRIZZLE

I’ve already blogged a bit about Drizzle, but it sure looks like Drizzle + XtraDB might be a match made in heaven. Drizzle can be thought of as a MySQL engine re-write with an eye towards web workloads and performance, rather than features. MySQL 4.1, 5.0, and 5.1 added a lot of features that bloated the code without offering anything really useful to web-oriented workloads like ours, so the Drizzle team is ripping all that stuff back out and rethinking the approaches to the things that are being left in. Very exciting.

SSD STORAGE

The advent of “cheap enough” super-fast SSD storage is finally upon us. I’ve got Sun S7410 storage appliances in production and they’re blazingly fast. I have a very thorough review coming, but the short version is that even with NFS latencies, we’re able to do obscene write workloads to these boxes (let alone reads). 10000+ write IOPS to 10TB of mirrored, crazy durable (thanks ZFS!) storage is a dream come true. Once you mix in snapshots, clones, replication, and Analytics - well, it just doesn’t get much better than this.

(Don’t get sticker shock looking at the web pricing - no-one pays anything even remotely like that. Sign up for Startup Essentials if you can, or talk to your Sun sales rep if you can’t, and you can get them much cheaper. I nearly had a heart attack myself until I got “real” pricing. Tell them I sent you - enough Sun people read this blog, it might just help).

STILL NEEDED…

So, all in all, there’s been an awful lot of progress this year, which is great. CPUs are finally scaling under InnoDB, and we finally have storage that isn’t bounded by physical rotation and mechanical arms. Unfortunately, great CPU scaling plus amazing IO capabilities isn’t something InnoDB digests very well. As is common in complicated systems, once you fix one bottleneck, another one elsewhere in the system crops up. This time, it’s IOPS. It was eerie reading Mark Callaghan’s post about this last night - I’d come to the exact same conclusions (from an Operations point of view rather than code-level) just yesterday.

Bottom line: Despite having ample CPU and ample IO, InnoDB isn’t capable of using the IO provided. You can bet we’ll be working with Percona, Google and Sun (read: sitting back and admiring their brilliant work while writing the occasional check and providing production workload information) to look into fixing this.

In the meantime, we’re back to the old standbys: replication and data partitioning. Yes, we’re stacking lots of MySQL instances on each S7410 to maximize both our IOPS and our budget. Fun stuff - more on that later.

UPDATE: Just occurred to me that there are plenty of *new* readers to my blog who haven’t heard me praise Google and their patches before. Mark Callaghan’s team over at Google definitely deserves a shout-out - they’ve really been a catalyst for much of this work along with Percona.

Categories: Technology

On Why Auto-Scaling in the Cloud Rocks

Tue, 12/09/2008 - 14:32

In high school, I had a great programmable calculator. I’d program it to solve complicated math and science problems “automatically” for me. Most of my teachers got upset if they found out, but I’ll always remember one especially enlightened teacher who didn’t. He said something to the effect of “Hey, if you managed to write software to solve the equation, you must thoroughly understand the problem. Way to go!”.

George Reese wrote up a blog post over at O’Reilly the other day called On Why I Don’t Like Auto-Scaling in the Cloud. His main argument seems to be that auto-scaling is bad and reflects poor capacity planning. In the comments, he specifically calls SmugMug out, saying we’re “using auto-scaling as a crutch for poor or non-existent capacity planning”.

George is like one of those math teachers who doesn’t “get it”. I was tempted not to write this post because he gets it so wrong, I’d hate to spread that meme. SkyNet auto-scales well. No humans at SmugMug are monitoring it and it just hums along, doing its job. Why is it so efficient? Because I understand the equation. I know what metrics drive our capacity planning and I programmed SkyNet to take these into account. It checks an awful lot of data points every minute or so - this isn’t simply “oh, we have idle CPU, let’s kill some instances.” (I would argue that, depending on the application, simple auto-scaling based on CPU usage or a similar data point can be very effective, too, though).
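To make “understanding the equation” a bit more concrete, here is a minimal Python sketch of a scaling decision that weighs more than one signal. It is not SkyNet’s code: the post doesn’t list the metrics SkyNet checks, so the signals (`queued_jobs`, `cpu_utilization`), the thresholds, and every name below are hypothetical and only illustrate the idea.

```python
import math
from dataclasses import dataclass


@dataclass
class Sample:
    """One reading of (hypothetical) capacity signals."""
    queued_jobs: int                  # work waiting to be picked up
    jobs_per_instance_per_min: float  # measured throughput of one instance
    cpu_utilization: float            # fleet average, 0.0 to 1.0
    running: int                      # instances currently running


def desired_instances(s, drain_minutes=10.0, cpu_ceiling=0.75,
                      floor=1, ceiling=50):
    """Return how many instances we would like to be running right now."""
    if s.jobs_per_instance_per_min <= 0:
        raise ValueError("per-instance throughput must be positive")

    # Instances needed to drain the backlog within the target window.
    by_queue = math.ceil(
        s.queued_jobs / (s.jobs_per_instance_per_min * drain_minutes))

    # Instances needed to pull average CPU back under the ceiling.
    by_cpu = math.ceil(s.running * s.cpu_utilization / cpu_ceiling)

    # Take the more demanding signal, then clamp to sane bounds.
    return max(floor, min(ceiling, max(by_queue, by_cpu)))


if __name__ == "__main__":
    busy = Sample(queued_jobs=1200, jobs_per_instance_per_min=6.0,
                  cpu_utilization=0.9, running=10)
    print(desired_instances(busy))
```

The point of the sketch is only that the decision comes from an explicit capacity model (backlog, throughput, a drain target) rather than from idle CPU alone; the model is the capacity planning.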

SkyNet has been in production for over a year with only two incidents of note and SmugMug has more than doubled in size and capacity during that time without adding any new operations people. How on earth is this a bad thing?

Categories: Technology

First 1080p video from Canon’s new 5D MkII - Amazing!

Sun, 11/30/2008 - 15:47


My father and I got our 5D MkIIs on Friday and we could hardly wait for the batteries to charge. He took his to SF to test its vaunted low-light performance and posted this 60-second 1080p clip (along with other resolutions) on his SmugMug site: Click to watch it auto-sized for your monitor or check out the full 1080p resolution (caution - *high* bandwidth! UPDATE: Apologies if you tried to watch 1080p on Windows earlier. My bug made it look terrible. Try again, please?).

Here’s his story:

“I had seen Vincent Laforet’s amazing short film, but only in 720p. I knew what an amazing photographer he is and wondered how close an everyman like me could come to footage like that. Could the clips possibly hold up to viewing in 1080p?”

“So with only an hour’s practice shooting my dog licking peanut butter and the neighbor’s kids running in their yard, I left for the city to compare myself to a Pulitzer Prize-winning photographer with his helicopter, pricey stabilizer, models, set lighting, and post-production experts. I had a few hours and a tripod. What we had in common was the 5D.”

“At first, shooting video on the 5D makes you feel stupid. You’re holding the camera out in front while you look at the LCD on the back and use completely different buttons. I was always wondering if it was in focus, especially at the wide apertures I thought you probably needed at night.”

“How dark was too dark? I’d point at things that seemed impossibly dark, like the fishing boats you saw lit with mostly a string of Christmas lights on the bow of one. But I couldn’t tell how noisy the clips were on the viewfinder, so I held my breath and set the camera at ISO 3200. Why? ‘Cus it was lower than ISO 6400…”

“I had just one sekret weapon, same as Vincent: a Canon 200mm f/2.0 lens, not exactly an everyman item. It made a difference and I used it for maybe half the shots, including the opening clip of the couple kissing at Grace Cathedral, the rotating jewelry in the shop window, the hotel entrances, and the TV reporter. The city skyline was shot with an f/4 lens and it’s noisy. I also used an 85mm f/1.2 for scenes like the cable car, and toys in shop windows.”

“Dog and kid shots look amazing too, but I have to be honest: I missed many shots of fast-moving kids that I would have gotten with my video camera. Maybe I just need to figure out how to juggle zooming, focus, and having the controls scattered across the back of the camera, but it felt like I needed three hands and the skillz of a Cirque du Soleil juggler.”

“So which camera for filming my grandkids? Now there’s a question… This calls for some serious 5D time to answer. Even my wife approves of that message.”

BTW, if anyone else out there is shooting 1080p video with cameras like this and would like their SmugMug Pro accounts to allow 1080p video, let us know. That feature is currently in beta, but we’d love to get a few more people using it.

Categories: Technology

Nominations open for The Crunchies 2008

Tue, 11/25/2008 - 15:09

SmugMug wins Best Design at The Crunchies 2007 by Luca Filigheddu Photography

I loved the idea behind The Crunchies even before we won for Best Design last year so I’m glad to see their triumphant return. And I see that nominations are open for 2008!

It looks like we’re probably eligible for a number of categories, but even if you don’t think we’re worthy, please go nominate your favorite startups. It really means a lot to the teams that work on these companies. Nothing like a little validation for all of our hard work…

Categories: Technology

I’m *not* speaking at Cloud Computing Expo 2008

Mon, 11/17/2008 - 20:18

Just a quick update, I was invited to speak at Sys-Con’s Cloud Computing Expo 2008 West (how’s that for a mouthful?) and accepted, planning on talking about SkyNet, S3, and our future use of cloud computing. Alas, my Inbox is so crazy, I failed to see the handful of emails the conference sent me asking me to sign a contract of some type. So I missed the deadline and they canceled my spot. (BTW, I can’t recall a conference ever asking me to sign something to speak, but this one does and I was full of FAIL.)

So, I’m sorry, despite being listed on the program, I’m *not* speaking there this week. It was my bad - I just missed the emails (as I miss so many emails these days). But still, a phone call from them wouldn’t have hurt, would it?

Who knows if I’ll be on their invitation list next year, but the conference will be great anyway, so have a great time without me!

Categories: Technology

Sweet new Sun storage stuff on Monday, Nov 10th

Sun, 11/09/2008 - 16:31

FYI, Sun is announcing some sweet new storage stuff on Monday at 3:30pm PT.

I’m reviewing a few of the things they’re announcing, and hope to publish my thoughts here soon (one of them joins my production network tonight if all goes well). However, I’m at Disneyland with my kids (first trip!) from Monday through Thursday, so I don’t know (yet) when I’ll be able to write them up. Bear with me if it takes a few days.

But the gear is exciting, and the direction Sun is headed is even more exciting!

Categories: Technology

Now is the time to build

Thu, 10/30/2008 - 22:13

Big Cats by micalngelo

“Every startup CEO is at least thinking about the need to cut back right now” - Michael Arrington

“We simply attempt to be fearful when others are greedy and to be greedy only when others are fearful.” - Warren Buffett

I’ll give you one guess as to which man I’m listening to. So no, not every startup CEO is cutting back. Apple spent their time innovating during the last downturn and look where it got them. I’m thrilled to have just passed out big, healthy profit-sharing bonuses to all of our employees this week for the 5th consecutive year. We think and hope they’ll be even bigger next year.

SmugMug was founded in the middle of the last “nuclear winter” in Silicon Valley. Everyone told us we were crazy, and we knew there was no chance at raising venture capital at a decent valuation, even with our impressive backgrounds. So we did what any good entrepreneur would do: We did it anyway, with both eyes firmly on our business model.

So if you’re running a startup, or thinking of creating one, take heart - downturns are a fabulous time to build and grow businesses. Focus on your revenues and your margins, not your growth rate or # of unique visitors. Find some stable income streams and a customer need. Listen to your customers and give them what they want - and what they’re willing to pay for. And take care of your employees - they’re your most valuable asset.

SmugMug is still hiring Sorcerers, Heroes, and all manner of other mythical beings capable of impossible feats. We filled our last position (quickly, I might add) with a *great* hire (and I’m still sorting through the avalanche of resumes we got to see if we can add a few more), but the job door is never closed at SmugMug for true superstars. Our philosophy is to not let anyone amazing get away, even if we don’t technically have an open position for you.

So if you can make magic and want to work for a company that takes crazy-good care of its employees, let us know.

Categories: Technology

Huge EC2 release: Load Balancing & Auto-Scaling!

Mon, 10/27/2008 - 10:57

June 5th, 2008 near Maryville, Missouri by Shane Kirk

In case you didn’t see it, Amazon had a huge EC2 announcement the other day that included:

  • EC2 is now out of beta.
  • EC2 has an SLA!
  • Windows is now available on EC2
  • SQL Server is now available on EC2

But the really cool bits, if you ask me, are the announcements about the next wave of related services:

  • Monitoring
  • Load Balancing
  • Auto-Scaling
  • A web-based management console

As frequent readers of my blog and/or conference talks will know, this means one of the last important building blocks to creating fully cloud-hosted applications *at scale* is nearly ready for primetime.

For those keeping score at home, my personal checklist shows that the only thing now missing is a truly scalable, truly bottomless database-like data store. Neither Elastic Block Store (EBS) nor SimpleDB really solves the entire scope of the problem, though they’re great building blocks that do solve big pieces (or everything, at smaller scale). I’m positive that someone (Amazon or other) will solve this problem and I can start moving more stuff “to the Cloud”.

I can’t wait.

Categories: Technology

Live-tweeting Cloud keynote at PDC 2008

Mon, 10/27/2008 - 10:25

UFO OR CLOUD? by Shane Kirk

Microsoft is announcing some exciting Cloud Computing stuff today at their Professional Developers Conference (PDC). Assuming it’s the same stuff (and more?) I’ve been briefed on over the last year, it’s pretty exciting stuff.

I’ll be live-tweeting the best bits over on my Twitter account. If this stuff is interesting to you, come check it out.

Categories: Technology

Feedburner hiccup, sorry about that.

Mon, 10/13/2008 - 20:23

For some reason, Feedburner’s feed of my blog broke over the weekend. Not sure why, but I think I fixed it. Apologies to everyone who’s a few days lagged with my latest posts in their favorite blog reader - it was me, not you.

Categories: Technology

Amazon S3: Price reduction

Mon, 10/13/2008 - 18:25

I know a lot of you get your Amazon Web Services news from me, so I thought I’d better mention this one. It’s huge!!

Amazon announced S3 price reductions as you scale. For us, since we’re way beyond 500TB, this is huge. And for any of you who are still in their first tier, it’s something to look forward to.

DevPay also got a significant new release, pricing-wise, recently, so if you’re interested in that, better check it out.

Thanks Amazon!

Categories: Technology

Canon 5D MkII footage is back up!

Mon, 10/13/2008 - 18:19

©Vincent Laforet - Blog.vincentlaforet.com

Pulitzer Prize-winning photographer Vincent Laforet’s awesome Canon 5D MkII film, Reverie, is once again hosted at SmugMug in all its HD glory. I believe it’s only up for this week or something and then we have to take it down again, so you’d better go watch it while you have the chance.

See it auto-sized for your screen & browser or view it in Hi-Def. Your choice.

Don’t forget to check out the behind the scenes footage, too, also auto-sized for you or in full Hi-Def.

Enjoy!

Categories: Technology

ZFS & MySQL/InnoDB Compression Update

Mon, 10/13/2008 - 17:43

Network.com setup in Vegas, Thumper disk bay, green by Shawn Ferry

As I expected it would, the fact that I used ZFS compression on our MySQL volume in my little OpenSolaris experiment struck a chord in the comments. I chose gzip-9 for our first pass for a few reasons:

  1. I wanted to see what the “best case” compression ratio was for our dataset (InnoDB tables)
  2. I wanted to see what the “worst case” CPU usage was for our workload
  3. I don’t have a lot of time. I need to try something quick & dirty.

I got both those data points with enough granularity to be useful: a 2.12X compression ratio over a large & varied dataset, and the compression was fast enough to not really be noticeable for my end users. The next step, obviously, is to find out what the best ratio of compression and CPU is for our data. So I spent the morning testing exactly that. Here are the details:

  • Created 11 new ZFS volumes (compression = [none | lzjb | gzip1-9])
  • Grabbed 4 InnoDB tables of varying sizes and compression ratios and loaded them in the disk cache
  • Measured how long it took (using ‘ptime’) to read each file from cache and write it to disk (using ‘cp’), while watching CPU utilization (using ‘top’, ‘prstat’, and ‘mpstat’)
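The timing step can be approximated with a small harness. The actual test used ‘ptime’ and ‘cp’ on OpenSolaris; the sketch below is a stand-in that measures wall-clock time from Python and keeps the fastest of several runs to reduce noise. The function names, the `settle_seconds` pause, and the directory layout (one directory per compression setting) are assumptions made for illustration.

```python
import shutil
import time
from pathlib import Path


def fastest_copy(src, dst_dir, runs=10, settle_seconds=0.0):
    """Copy src into dst_dir `runs` times; return the fastest time in seconds."""
    src = Path(src)
    dst = Path(dst_dir) / src.name
    best = float("inf")
    for _ in range(runs):
        if dst.exists():
            dst.unlink()  # always write a fresh file, never overwrite in place
        start = time.perf_counter()
        shutil.copyfile(src, dst)
        best = min(best, time.perf_counter() - start)
        if settle_seconds:
            time.sleep(settle_seconds)  # give write caches time to drain
    return best


def compare_volumes(src, volumes, runs=10, settle_seconds=0.0):
    """volumes maps a label (e.g. 'lzjb') to a directory on that volume."""
    return {label: fastest_copy(src, path, runs, settle_seconds)
            for label, path in volumes.items()}
```

A call might look like `compare_volumes("table4.ibd", {"gzip-1": "/data/compression/gzip-1"})`, with one entry per volume under test. Like ‘cp’, this does not force data to disk, so small files mostly measure cache and system overhead rather than compression.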

It quickly became obvious that there’s relatively little difference in compression between gzip-1 and gzip-9 (and, contrary to what people were saying in the comments, relatively little difference in CPU usage, either, in 3 of the 4 cases. The other case, though… yikes!). So I quickly stopped testing anything but ‘none’, ‘lzjb’, ‘gzip-1’, and ‘gzip-9’. (LZJB is the default compression for ZFS - gzip-N was added later as an option).

Note that all the files were pre-cached in RAM before doing any of the tests, and ‘iostat’ verified we were doing zero reads. Also note that this is writing to two DAS enclosures with 15 x 15K SCSI disks apiece (28 spindles in a striped+mirrored configuration) with 512MB of write cache apiece. So these tests complete very quickly from an I/O perspective because we’re either writing to cache (for the smaller files) or writing to tons of fast spindles at once (the bigger files). In theory, this should mean we’re testing CPU more than we’re testing our IO - which is the whole point.

I ran each ‘cp’ at least 10 times, letting the write cache subside each time, selecting the fastest one as the shown result. Here they are (and be sure to read the CPU utilization note after the tables):

TABLE1

compression     size     size ratio    time
uncompressed    172M     1             0.207s
lzjb            79M      2.18X         0.234s
gzip-1          50M      3.44X         0.24s
gzip-9          46M      3.73X         0.217s

Notes on TABLE1:

  • This dataset seems to be small enough that much of the time is probably spent in system internals, rather than actually reading, compressing, and writing data, so I view this as only an interesting size datapoint, rather than size and time. Feel free to correct me, though.

TABLE2

compression     size     size ratio    time       speedup
uncompressed    631M     1             1.064s     1
lzjb            358M     1.76X         0.668s     1.59X
gzip-1          253M     2.49X         1.302s     0.82X
gzip-9          236M     2.67X         11.1s      0.10X

Notes on TABLE2:

  • gzip-9 is massively slower on this particular hunk of data. I’m no expert on gzip, so I have no idea why this would be, but you can see the tradeoff is probably rarely worth it, even if we were using precious storage commodities (say, flash or RAM rather than hard disks). I ran this one a few extra times just to make sure. Seems valid (or a bug).

TABLE3

compression     size     size ratio    time       speedup
uncompressed    2675M    1             15.041s    1
lzjb            830M     3.22X         5.274s     2.85X
gzip-1          246M     10.87X        44.287s    0.34X
gzip-9          220M     12.16X        52.475s    0.29X

Notes on TABLE3:

  • LZJB really shines here, performance wise. It delivers roughly 3X faster performance while also chewing up roughly 3X fewer bytes. Awesome.
  • gzip’s compression ratios are crazy great on this hunk of data, but the performance is pretty awful. Definitely CPU-bound, not IO-bound.

TABLE4

compression     size     size ratio    time       speedup
uncompressed    2828M    1             17.09s     1
lzjb            1814M    1.56X         14.495s    1.18X
gzip-1          1384M    2.04X         48.895s    0.35X
gzip-9          1355M    2.09X         54.672s    0.31X

Notes on TABLE4:

  • Again, LZJB performs quite well. 1.5X bytes saved while remaining faster. Nice!
  • gzip is again very obviously CPU bound, rather than IO-bound. Dang.
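For anyone who wants to check the ratio columns or add their own rows: both are simple quotients against the uncompressed row. The Python sketch below recomputes them from TABLE4’s sizes and times as given above; the function and variable names are mine, chosen for illustration.

```python
def size_ratio(uncompressed_mb, compressed_mb):
    """How many times smaller the compressed file is."""
    return round(uncompressed_mb / compressed_mb, 2)


def speedup(uncompressed_s, compressed_s):
    """Greater than 1 means faster than the uncompressed copy, less means slower."""
    return round(uncompressed_s / compressed_s, 2)


# TABLE4: size in MB and copy time in seconds for each compression setting.
BASE_MB, BASE_S = 2828, 17.09
TABLE4 = {
    "lzjb": (1814, 14.495),
    "gzip-1": (1384, 48.895),
    "gzip-9": (1355, 54.672),
}

if __name__ == "__main__":
    for name, (mb, seconds) in TABLE4.items():
        print(f"{name:8s} {size_ratio(BASE_MB, mb)}X  {speedup(BASE_S, seconds)}X")
```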

There’s one other very important datapoint here that ‘ptime’ itself didn’t show - CPU utilization. On every run with LZJB, both ‘top’ and ‘mpstat’ showed idle CPU. The most I saw it consume was 70% of the aggregate of all 4 CPUs, but the average was typically 30-40%. gzip, on the other hand, pegged all 4 CPUs on each run. Both ‘top’ and ‘mpstat’ verified that 0% CPU was idle, and interactivity on the bash prompt was terrible on gzip runs.

Some other crazy observations that I can’t explain (yet?):

  • After a copy (even to an uncompressed volume), ‘du’ wouldn’t always show the right bytes. It took time (many seconds) before showing the right # of bytes, even after doing things like ‘md5sum’. I have no idea why this might be.
  • gzip-9 made a smaller file (1355M vs 1380M) on this new volume as opposed to my big production volume (which is gzip-9 also). I assume this must be due to a different compression dictionary or something, but it was interesting.
  • Sometimes I’d get strange error messages trying to copy a file over an existing one (removing the existing one and trying again always worked):
      bash-3.2# ptime cp table4.ibd /data/compression/gzip-1
      cp: cannot create /data/compression/gzip-1/table4.ibd: Arg list too long
  • After running lots of these tests, I wasn’t able to start MySQL anymore. It crashed on startup, unable to allocate enough RAM for InnoDB’s buffer pool. (You may recall from my last post that MySQL seems to be more RAM limited under OpenSolaris than Linux). I suspect that ZFS’s ARC might have sucked up all the RAM and was unwilling to relinquish it, but I wasn’t sure. So I rebooted and everything was fine.

Conclusion? Unless you care a great deal about eking out every last byte (using a RAM disk, for example), LZJB seems like a much saner compression choice. Performance seems to improve, rather than degrade, and it doesn’t hog your CPU. I’m switching my ZFS volume to LZJB right now (on-the-fly changes - woo!) and will copy all my data so it gets the new compression settings. I’ll sacrifice some bytes, but that’s ok - performance is king.

Also, my theory that I’d always have idle CPU with modern multi-core chips so compression wouldn’t be a big deal seems to be false. Clearly, with gzip, it’s possible to hog your entire CPU if you’re doing big long writes. We don’t tend to do high-MB/s reads or writes, but it’s clearly something to think about. LZJB seems to be the right balance.

So, what should I test next? I wouldn’t mind testing compression latencies on very small reads/writes more along the lines of what our DB actually does, but I don’t know how to do that in a quick & dirty way like I was able to here.

Also, I have to admit, I’m curious about the different checksum options. Has anyone played with anything other than the default?

Categories: Technology

Success with OpenSolaris + ZFS + MySQL in production!

Fri, 10/10/2008 - 17:14

Pimp My Drive by Richard and Barb

There’s remarkably little information online about using MySQL on ZFS, successfully or not, so I did what any enterprising geek would do: Built a box, threw some data on it, and tossed it into production to see if it would sink or swim.

I’m a Linux geek, have been since 1993 (Slackware!). All of SmugMug’s datacenters (and our EC2 images) are built on Linux. But the current state of filesystems on Linux is awful, and it’s been awful for at least 8 years. As a result, we’ve put our first OpenSolaris box into production at SmugMug and I’ve been pleasantly surprised with the performance (the userland portions of the OS, though, leave a lot to be desired). Why OpenSolaris?

ZFS.

ZFS is the most amazing filesystem I’ve ever come across. Integrated volume management. Copy-on-write. Transactional. End-to-end data integrity. On-the-fly corruption detection and repair. Robust checksums. No RAID-5 write hole. Snapshots. Clones (writable snapshots). Dynamic striping. Open source software. It’s not available on Linux. Ugh. Ok, that sucks. (GPL is a double-edged sword, and this is a perfect example). Since it’s open-source, it’s available on other OSes, like FreeBSD and Mac OS X, but Linux is a no go. *sigh* I have a feeling Sun is working towards GPL’ing ZFS, but these things take time and I’m sick of waiting.

The OpenSolaris project is working towards making Solaris resemble the Linux (GNU) userland plus the Solaris kernel. They’re not there yet, but the goal is commendable and the package management system has taken a few good steps in the right direction. It’s still frustrating, but massively less so. Despite all the rough edges, though, ZFS is just so compelling I basically have no choice. I need end-to-end data integrity. The rest of the stuff is just icing on an already delicious cake.

The obvious first place to use ZFS was for our database boxes, so that’s what I did. I didn’t have the time, knowledge of OpenSolaris, or inclination to do any synthetic benchmarking or attempt to create an apples-to-apples comparison with our current software setup, so I took the quickest route I could to have a MySQL box up and running. I had two immediate performance metrics I cared about:

  • Can a MySQL slave on OpenSolaris with ZFS keep up with the write load with no readers?
  • If yes, can the slave shoulder its fair share of the reads, too?

Simple and to the point. Here’s the system:

  • SunFire X2200 M2 w/64GB of RAM and 2 x dual-core 2.6GHz Opterons
  • Dell MD3000 w/15 x 15K SCSI disks and mirrored 512MB battery-backed write caches (these are really starting to piss us off, but that’s another post…)

The quickest path to getting the system up and running resulted in lots of variables in the equation changing:

  • Linux -> OpenSolaris (snv_95 currently)
  • MySQL 5.0 -> MySQL 5.1
  • LVM2 + ext3 -> ZFS
  • Hardware RAID -> Software RAID
  • No compression -> gzip-9 volume compression

Whew! Lots of changes. Let me break them down one by one, skipping the obvious first one:

MySQL - MySQL 5.1 is nearing GA, and has a couple of very important bug fixes for us that we’ve been working around for an awfully long time now. When I downloaded the MySQL 5.0 Enterprise Solaris packages and they wouldn’t install properly, that made the decision to dabble with 5.1 even easier - the CoolStack 5.1 binaries from Sun installed just fine.

Going to MySQL 5.1 on a ~1TB DB is painful, though, I should warn you up front. It forced ‘REPAIR TABLE’ on lots of my tables, so this step took much longer than I expected. Also, we found that the query optimizer in some cases did a poor job of choosing which indexes to use for queries. A few “simple” SELECTs (no JOINs or anything) that would take a few milliseconds on our 5.0 boxes took seconds on our 5.1 boxes. A little bit of code solved the problem and resulted in better efficiency even for the 5.0 boxes, so it was a net win, but painful for a few hours while I tracked it down.

Finally, after running CoolStack for a few days, we switched (on advice from Sun) to the 5.1.28 Community Edition to fix some scalability issues. This made a huge difference so I highly recommend it. (On a side note, I wish MySQL provided Enterprise binaries for 5.1 for their paying customers to test with). The Google & Percona patches should make a monster difference, too.

Volume management and the filesystem - There’s some debate online as to whether ZFS is a “layering violation” or not. I couldn’t care less - it’s pure heaven to work with. This is how filesystems should have always been. The commands to create, manage, and extend pools are so simple and logical you basically don’t even need man pages (discovering disk names, on the other hand, isn’t easy. I finally used ‘format’ but even typing it gives me the shivers…).

    zpool create MYPOOL c0t0d0

You just created a ZFS pool. Want a mirror?

    zpool create MYPOOL mirror c0t0d0 c0t0d1

Want a striped mirror (RAID-1+0) w/spare?

    zpool create MYPOOL mirror c0t0d0 c0t0d1 mirror c0t0d2 c0t0d3 spare c0t0d4

Want to add another mirror to an already striped mirror (RAID-1+0) pool?

    zpool add MYPOOL mirror c0t0d5 c0t0d6

Get the idea? Super-easy. Massively easier than LVM2+ext3 where adding a mirror is at least 4 commands: pvcreate, vgextend, lvextend, resize2fs - usually with an fsck in there too.

Software RAID - This is something we’ve been itching for for quite some time. With modern system architectures and modern CPUs, there’s no real reason “storage” should be separate from “servers”. A storage device should be just a server with some open-source software and lots of disks. (The “open source” part is important. I’m sick of relying on closed-source RAID firmware). The amount of flexibility, performance, reliability and operational cost savings you can achieve with software RAID rather than hardware is enormous. With real datacenter-grade flash storage devices just around the corner, this becomes even more vital. ZFS makes all of this stuff Just Work, including properly adjusting the write caches on the disk, eliminating the RAID-5 write hole, etc. Our first box still has a battery-backed write-cache between the disks and the CPU for write performance, but all the disks are just exposed as JBOD and striped + mirrored using ZFS. It rocks.

Compression - Ok, so this is where the geek in me decided to get a little crazy. ZFS allows you to turn on (and off) a variety of compression mechanisms on-the-fly on your pool. This comes with some unknown (depends on lots of factors, including your workload, CPUs, etc) performance penalty (CPU is required to compress/decompress), but can have performance upsides too (smaller reads and writes = less busy disk).

InnoDB is notoriously bad at disk usage (we see 2X+ space usage using InnoDB) and while it’s not an enormous concern, it’d be something nice to curtail. On most of our DB boxes, we have idle CPU around (we’re not really I/O bound either - MySQL is a strange duck in that you can be concurrency bound without being either CPU or I/O bound fairly easily thanks to poor locking), so I figured I’d go wild and give it a shot.

Lo and behold, it worked! We’re getting a 2.12X compression ratio on our DB, and performance is keeping up just fine. I ran some quick performance tests on large linear reads/writes and we were measuring 45.6MB/s sustained decompression and 39MB/s sustained compression on a single-threaded app on an Opteron CPU. We’ll probably continue to test compression stuff, and of course if we run into performance bottlenecks, we’ll turn it off immediately, but so far the mad science experiment is working.
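
If you want a ballpark figure before committing a whole pool, a quick sketch like the one below estimates the gzip -9 ratio for a single file. It is not how the numbers above were measured, and gzip_ratio is a made-up helper name. ZFS compresses each record separately, so whole-file gzip will tend to look better than what the pool actually achieves.

```shell
#!/bin/sh
# Sketch only: print the gzip -9 compression ratio
# (original size / compressed size) for one file.
gzip_ratio() {
    file=$1
    orig=$(wc -c < "$file")
    comp=$(gzip -9 -c "$file" | wc -c)
    # awk does the floating-point division that POSIX sh lacks.
    awk -v o="$orig" -v c="$comp" 'BEGIN { printf "%.2f\n", o / c }'
}
```

Call it as ‘gzip_ratio /path/to/some/table.ibd’ (the path is hypothetical) on a copy of a data file rather than on a live one.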

Configuration

Configuring everything was relatively painless. I bounced a few questions off of Sun (imho, this is where Sun really shines - they listen to their customers and put technical people with real answers within arm’s reach) and read the Evil Tuning Guide to ZFS. In the end I really only ended up tweaking two things (plus setting compression to gzip-9):

  • I set the recordsize to match InnoDB’s - 16KB. zfs set recordsize=16K MYPOOL
  • I turned off file-level prefetching. See the Evil Tuning Guide. (I’m testing with this on, now, and so far it seems fine).
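
Put together, those settings come out to something like the sketch below. It assumes the same MYPOOL name used earlier; the /etc/system line is the approach the Evil Tuning Guide describes, shown here as a comment rather than as something confirmed from this box.

```shell
# Sketch only; MYPOOL is the pool name used earlier in this post.
zfs set recordsize=16K MYPOOL        # match InnoDB's 16KB page size
zfs set compression=gzip-9 MYPOOL    # strongest (and slowest) gzip level

# Verify the settings and see the ratio the pool is actually achieving.
zfs get recordsize,compression,compressratio MYPOOL

# File-level prefetch is a kernel tunable, not a zfs property. The Evil
# Tuning Guide's approach is a line like this in /etc/system, then a reboot:
#   set zfs:zfs_prefetch_disable = 1
```

Both recordsize and compression only apply to data written after they’re set, so set them before loading the database files onto the pool.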

I believe since ZFS is fully checksummed and transactional (so partial writes never occur) I can disable InnoDB’s doublewrite buffer. I haven’t been brave enough to do this yet, but I plan to. I like performance.
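
For anyone who does want to try it, a minimal sketch of checking and disabling the doublewrite buffer follows. It is untested on this setup and assumes a ‘mysql’ client that can already log in without extra options.

```shell
# Sketch only: see whether the doublewrite buffer is currently enabled.
mysql -e "SHOW VARIABLES LIKE 'innodb_doublewrite'"

# To turn it off, add this under [mysqld] in my.cnf and restart mysqld
# (the setting can't be changed on a running server):
#   skip-innodb_doublewrite
```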

Performance

This box has been in production in our most important DB cluster for two weeks now. On the metrics I care about (replication lag, query performance, CPU utilization, etc) it’s pulling its fair share of the read load and keeping completely up on replication. Just eyeballing the stats (we haven’t had time to number crunch comparison stats, though we gave some to Sun that I’m hoping they crunch), I can’t tell a difference between this slave and any of the others in the cluster running Linux. I sure feel a lot better about the data integrity, though.

Why not [insert other OS here]?

We could have gone with Nexenta, FreeBSD, Mac OS X, or even *gulp* tried ZFS on FUSE/Linux. To be honest, Nexenta is the most interesting because it actually *is* the Solaris kernel plus Linux userland, exactly what I wanted. I’ve played with it a tiny bit, and plan to play with it more, but this is a mission-critical chunk of data we’re dealing with, so I need a company like Sun in my corner. I find myself wishing Sun had taken the Nexenta route (or offered support for it that I could buy or something). Instead, we’ll be buying software service & support from Sun for this and any other mission-critical OpenSolaris boxes.

FreeBSD also doesn’t have the support I need, Mac OS X wasn’t performant enough the last time I fiddled with it as a server, and most FUSE filesystems are slow so I didn’t even bother.

Gotchas

  • On my 64GB Linux boxes, I give InnoDB 54GB of buffer pool size. With otherwise exactly the same my.cnf settings, MySQL on OpenSolaris crashes with anything more than 40GB. 14GB, or 21.9% of my RAM, that I can’t seem to use effectively. Sun is looking into this, I’ll let you know if I find anything out.
  • For a Linux geek, OpenSolaris userland is still painful. Bear in mind that this is a single-purpose box, so all I really want to do is install and configure MySQL, then monitor the software and hardware. If this were a developer box, I would have already given up. OpenSolaris is still very early, so I’m still hopeful, but be prepared to invest some time. Some of my biggest peeves:
    • Common commands, like ‘ps’, have very different flags.
    • Some GNU bins are provided in /usr/gnu/bin - but a better ‘ps’ is missing, as is ‘top’ (no, ‘prstat’ is *not* the same!), ‘screen’, etc (Can anyone even use remote command-line Unix boxes without ‘screen’? If so, how?)
    • Packages are crazily named, making finding your stuff to install tough. Like instead of Apache being called ‘apache’ or ‘httpd’, it’s called ‘SUNWapch’. What?
    • After finally figuring out how to search for packages to get the names (‘pkg search -r Apache’ - which doesn’t provide pleasant results), I discovered that ‘top’ and ‘screen’ just simply aren’t provided (or they’re named even worse than I thought). Instead, I had to go to a 3rd party repository, BlastWave, to get them. And then, of course, the ‘top’ OpenSolaris package wouldn’t actually install and I had to manually break into the package and extract the binary. Ugh.

Whew! Big post, but there was a lot of ground to cover. I’m sure there are questions, so please post in the comments and I’ll try to do a follow-up. As I fiddle, tweak, and change things I’ll try to post updates, too - but no promises.

UPDATE: One other gotcha I forgot to mention. When MySQL (or, presumably, anything else running on the box) gets really busy, user interactivity evaporates on OpenSolaris. Just hitting enter or any other key at a bash prompt over SSH can take many seconds to register. I remember when Linux had these sort of issues in the past, but had blissfully forgotten about them.

UPDATE: I went more in depth on ZFS compression testing and blogged the results. Enjoy!

Categories: Technology

Just so we’re clear - I love Canon :)

Wed, 09/24/2008 - 17:16

©Vincent Laforet - Blog.vincentlaforet.com

So you may have seen all the hooplah yesterday over Canon and Vincent Laforet’s amazing Canon 5D MkII footage. I thought maybe a little explanation was in order. First, a little background on me and Canon:

  • I, personally, am a monster Canon fanboy. I have a lot of cameras, and all of them - my collection of happy-snappys, our dSLRs, and even our video cameras - are Canon.
  • Our company is filled with Canon fanboys. We have more dSLR Canon bodies and lenses lying around than I can count.
  • The 5D MkII is the coolest camera I’ve ever heard of. Dozens of SmugMuggers have already pre-ordered them.
  • I’ve been dying to work with Canon since we started SmugMug. We’re a Top 500 website, we reach 6.5M people a month, our demographic is definitely high-end, and Nikon’s already in bed with Flickr. Sounds like a match made in heaven to me.

Ok, so now that I’ve set the stage, let’s talk about Vincent’s movie a little bit:

  • SmugMug had nothing to do with the production of the film. We didn’t even know it existed until we read this post on Vincent’s blog on Saturday afternoon.
  • The entire company caught fire. We lost our minds, we were so excited. Within minutes, we’d offered to provide *unlimited* HD bandwidth to Vincent. Bear in mind this was an unknown, but likely very large, cost with no real tangible upside. But we built this company because we love photography, video, and gadgets - and we’ve gotta stick with what we love.
  • Vincent enthusiastically took us up on our offer, and we all started brainstorming about how we could best release the film. Then we started brainstorming on how great this camera would be for indie photographers and filmmakers, and we lost our minds again. By Sunday morning, we had committed $25-50K to create a community-driven film using the Canon 5D MkII. (Note how fast things are moving - they were moving so fast, none of us had time to catch our breath).
  • We found out that Vincent had some awesome Behind-the-Scenes footage of the making of his film, Reverie, and so of course we offered to host that for free, too.
  • The time for release arrived. Now, this entire time, we’ve never talked to anyone at Canon. As far as I knew, this wasn’t a Canon deal - Vincent clearly says Canon told him “You can then produce a video and stills completely independently from Canon U.S.A.”
  • We posted full HD versions of both Reverie and the behind-the-scenes footage for the world to see, crossing our fingers that our bandwidth bill wouldn’t be more than we could bear.
  • Our customers went bananas. Awesome! They’re thrilled we’re interested in this stuff, because they’re interested in this stuff. Ok, great, so maybe this bandwidth bill will pay off in goodwill.
  • The press went bananas - both mainstream and online. Awesome! They’re gaga over the user response and the remarkable camera.
  • We got busy (and I personally got busy) telling everyone, press and non alike, who called, emailed, tweeted, blogged, etc that the Canon 5D MkII is a game-changing camera the likes of which we haven’t seen before.
  • Canon asked Vincent to ask us to take Reverie down.

SAY WHAT?!

Canon asked Vincent to ask us to take Reverie down.

Being a Canon fanboy, I quickly complied - with a very heavy heart. I felt like I’d been kicked in the gut by one of my heroes. I felt betrayed. I also wrote a few things in the heat of the moment that came out harsher than they should have (and thankfully I didn’t publish what I’d originally written - whew!). I’ve now edited my blog post and would like to apologize to anyone at Canon who I offended - I certainly wasn’t attacking Canon’s great employees, I was just lashing out.

But look at it from my point of view. I was risking an awful lot of money on bandwidth (I doubt it would have topped 6 figures, but easily could have been in the 5s) because I’m a camera geek and I love this stuff. Customer goodwill is fabulous, and we love generating it, but we were really doing this because we love the camera, love the passion that went into the film, and love to help our industry. We were hopeful that that goodwill would come back to us someday - but even if it didn’t, the chance to be a part of something as momentous as this film from this camera was worth it. And a good chunk of the company busted their butts over the weekend to make this happen. We could have been playing with our kids or out shooting photographs, but instead we spent the weekend setting things up for Vincent’s release.

And instead of appreciating how generous I thought we were being, and appreciating the monster amount of PR they were getting (better PR than any amount of money can buy), it felt like Canon was arbitrarily cutting us off for no good reason. I found myself asking “Well, if they want to host it on their pages, why don’t they just embed the video from SmugMug? Then they get it for free and we still get to be involved. It doesn’t even have to show our logo or anything - just use Quicktime but use a file from SmugMug’s servers. We’d save them money!”. We just wanted to be involved. And no-one at Canon called or emailed us at all - as I’m writing this, I’ve still never talked to anyone at Canon on this “independent from Canon” project.

In the cold light of the next day, though, I can see that I overreacted. It’s a sign of my passion for Canon and their products. No-one overreacts when some bad company does something stupid. But just look at Apple - the instant they make a mis-step (or even perceived mis-step), everyone is up in arms, ready to lynch Steve. Why? Because their products are so dang good, everyone’s super-passionate about them. So I let my passion get the better of me. I still wish Canon had wanted to work together, or at least let us be part of the project, but does it really matter?

I’m still buying a Canon 5D MkII and, I’m sure, lots of Canon goodies to go along with it. So what are you waiting for? Go get your own.

Categories: Technology

Amazing Canon 5D MkII HD video footage!!

Tue, 09/23/2008 - 01:13

©Vincent Laforet - Blog.vincentlaforet.com

Pulitzer Prize-winning photographer Vincent Laforet got his hands on a Canon 5D MkII for a weekend. Rather than shoot some quick stills, he rounded up an entire film crew and put them to work using the amazing 1080p video capture it offers - in helicopters, no less! When SmugMug heard about this, we went bananas and offered to host both the short film itself, Reverie (want it in HD?):

UPDATE: There should be an embedded video of the short film right here, and a link to the HD version. But there isn’t (anymore). Go check it out on Canon’s own website instead.

Meanwhile, you can see the Behind the Scenes footage (want it in HD?):

Then we went a little more bananas, and ponied up $25K to sponsor a community-created film led by Vincent, with another $25K to follow if other sponsors get on the train. We think this camera is truly a game-changer and we’re thrilled to help visionaries like Vincent prove it to the world.

Now, the astute geeks in the audience will note that Reverie isn’t hosted in 1080p, but instead is at 720p. I wish it weren’t so, and we’re actively trying to get our hands on the 1080p footage right out of Final Cut so we can let everyone take a peek - but it’s not our footage, so I don’t actually have it. I believe Canon may be putting it online themselves, but if they don’t, I’ll do everything I can to put it up - so stay tuned to Vincent’s blog as well as my own.

Man I love this industry! Thanks Canon!

Categories: Technology

Hot technologies I care about - Sep ‘08

Wed, 09/17/2008 - 17:49

photo by: ikegami

I’ve been too busy to blog lately, and for that I apologize.  But here’s a quicky detailing the technologies (internet related and not) I’m excited about right now:

  • Drizzle.  For years now, I’ve felt that MySQL has been going in a direction in opposition to my use case.  Stored procedures, views, etc etc have added bloat and complexity without offering me anything useful.  Turns out I’m not alone - and thus Drizzle was born.  To say I’m *super* excited about this is a serious understatement.
  • Google & Percona’s MySQL patches.  While I wait for Drizzle, I’m stuck dealing with terrible concurrency issues in MySQL/InnoDB that force us to partition data way before we really should have to, making our system more complex.  It’s crazy having a server keel over when it shouldn’t be either CPU-bound *or* IO-bound but that’s life with MySQL and InnoDB these days - or at least, it was until Google and Percona fixed what I couldn’t get MySQL to fix with our Platinum Enterprise subscriptions.  Open source rules!
  • Flash storage.  I really wish I could talk about this some more (pesky NDAs), but there are datacenter changes coming that are more dramatic than anything I’ve seen in 14 years of working on them. I hope I’ve talked to everyone in the space (and from the companies I’ve talked to, one of them seems to be the *very* clear winner for this upcoming round), but if you’re a storage vendor working on flash appliances and I haven’t talked to you, ping me.  We’re a bleeding edge customer and we’ll put your stuff in production faster than you can deliver it to us.  :)
  • ZFS.  Regardless of flash storage, ZFS is the filesystem of choice - head and shoulders over everything we’ve used or heard of.  The advent of flash just makes this even more compelling.  The downside?  It’s not on Linux.  :(
  • OpenSolaris.  ZFS is so incredible, my hand has been forced, and we’re about to put our first OpenSolaris system into production.  OpenSolaris is, in theory, the Solaris kernel (think ZFS, DTrace, SMF, high concurrency, etc) with the GNU-like userland (think Linux-like).  In practice, it’s still extremely painful for a Linux expert and Solaris n00b like me to use - even on a single-purpose machine like a MySQL server.  Only ZFS makes the pain worth it.  For development, it’s basically unusable for Linuxers (it’s probably fabulous for Solaris guys - lucky ducks).
  • Nexenta.  Unlike OpenSolaris, Nexenta *is* the Solaris kernel plus GNU userland.  Unfortunately, it’s not backed by Sun or anyone else I have any relationship with.  Sun has been absolutely the very best technology vendor we’ve ever dealt with in terms of support, technical knowledge, and just plain listening to us, so that’s a big issue.  I wish Sun had taken Nexenta’s approach (or would just buy them or offer support or something).  If OpenSolaris continues to be painful, we may fall back on Nexenta instead - remember, ZFS is the driving factor here.
  • Amazon Web Services competitors.  They’ve been promising they’d be coming out for years now and I’m shocked they’ve given Amazon this much runway.  But I believe a few more are getting very close (can’t say more, again, pesky NDAs).  Now, we’re extremely happy with Amazon, so we have no plans to switch, but competition is good for everyone - and Amazon is a fierce competitor.  Plus there are still gaps in Amazon’s strategy, and if I can mix & match to plug some of those gaps, awesome - sign me up.
  • Memcached.  This one’s been on my list for years, and it’s still way up there.  Binary protocol on the verge of shipping, nice patch to resolve some networking issues we’ve seen, and talk about scalability.  If you’re building web apps and this isn’t a core part of your infrastructure, you’re doing it wrong.
  • Big RAM.  4GB DIMMs are dirt cheap, so if you’re not loading your DB and Memcached boxes to the gills, you’re missing the boat.  Cheap 2-socket 64GB (and relatively cheap 128GB at 4-sockets) are here.
  • Sun Fire X4140 and X4440.  The best 1U (2-socket) and 2U (4-socket) servers on earth.  Despite being late to the game with quad-core, Opteron RAM performance kills Xeon, so these are the servers we’re buying.  You can load them to the gills with 4GB DIMMs, enjoy the dual-power supplies (yes, in the 1U box too), and crank out some great stuff.
  • OpenSocial, Y!OS, etc.  The big boys are finally getting real about getting open and cross-pollinating data and I think we’re finally nearing an inflection point.  We’re hiring a Sorcerer to do nothing but think and build in this space.  I’m sure magic will ensue.
  • Nikon D90 and Canon 5D MkII.  Nikon’s taken the photography world by storm with amazing high-ISO performance, and Canon just announced a DSLR that shoots full 1080p video.  Both look amazing and both are game-changers.
  • Onkyo TX-SR806.  I’m an A/V junkie and this thing is amazing.  5 HDMI inputs (need more?), THX Ultra2 Plus (the low-volume enhancements are *awesome* with young kids sleeping at home), automatic room EQ, decodes every modern audio encoding, etc.  I don’t even use the amplifier section (I have separates), but it’s turning out to be the best Pre/Pro I’ve ever owned.  Sounds fabulous on my gear.
  • iPhone App Store.  That thing is a game changer, and we’re barely seeing the tip of the iceberg.  All the other players have to respond - which is great for you and me.  And talk about a platform that’s a dream to develop on!

So there you have it.  Those are the most important pieces of tech I’m watching these days.  I’ll *definitely* be writing up our ZFS experiments as they come along and I have interesting data to share.  Stay tuned.

Oh, and if you’re curious about what I *wish* was on the list, there’s really only one thing:  iTunes syncing.  I have two desktops (one at my office, one at home) and two laptops, plus my wife has accounts on my computers.  Keeping those all in sync so that when I update a playlist at the office, the update is waiting for me at home, is a nightmare.  I’d pay lots of money if someone could solve that - seems like iTunes + AWS + a smart coder = solved, no?  Wish I had some time….

Categories: Technology

Job Opening: Social Sorcerer

Tue, 09/16/2008 - 20:35
          

photo by: Bill Evans Photography

How would you like to be the 8th Sorcerer here at SmugMug?  (We don’t hire engineers, programmers, or even coders - we only hire Sorcerers.  If you can’t work magic, I’m sure our competitors would love to see your resume…)

At SmugMug, everything we build is a direct result of customer feedback. We do very little, if any, competitive research - our customers keep us plenty busy. As a result, we’ve largely ignored social networking, especially outside of SmugMug. It just hasn’t been something our customers have asked for.

That’s changing. I’ve started getting tweets, blog comments, and forum posts about our “broken Facebook app”. Problem is, we don’t have a Facebook app.

The good news is we listen. So we’re ready to take the plunge. The geek in me has *always* wanted to dive into this stuff (and I’m the one who built and/or pushed us to build the building blocks we already have: an open API, Atom/RSS feeds, OpenID support, OAuth support, etc), so I’m thrilled we finally have the “ok” from my boss - our customers.

So if you’re high on social networking, particularly sharing photos anywhere and everywhere, we’d love to have you come work your magic. The job is extremely open-ended: You’d create our strategy, build our apps on other platforms, interact with our API developers who’ve already built some, and generally make it even easier for our customers to share their photos outside SmugMug. You’ll have to get your hands dirty - you’ll be writing the software (with the help of the other Sorcerers as needed), so managers and architects who no longer dirty their hands need not apply.

If that sounds like fun, we’re the best company to work for you’ve ever heard of (ok, this list sounds unbelievable, but I swear it’s all true):

  • We’re all super heroes.
  • We’re a privately held, profitable-for-years, fast-growing company (100%+ year-on-year for multiple years)
  • Fun projects. You choose what to work on rather than being assigned some fluff job. (I know, I know, unheard of - but I swear it’s true).
  • Small team. Your projects are your projects, not some multi-layer management effort doomed to fail.
  • Fast paced. Any week where we don’t do at least one software release is rare.
  • Large scale. Top 500 site. 350M+ photos, 800TB+ storage, 300M page views/month. Fun problems to solve
  • Big impact. Hundreds of thousands of paying customers and 6.5M+ visitors a month will use your work.
  • Family friendly. Full healthcare coverage, kids welcome for company meals and events. (Ex: We’re taking the whole company to Tahoe to ski & relax, including spouses and kids).
  • Distributed. Nearly 75% of our employees aren’t in Silicon Valley - they’re scattered all over the world, from Australia to Europe and a dozen US states.
  • Crazy benefits. We pay better salaries than the giants in Silicon Valley plus “early” stock options, profit sharing bonuses, matching 401k, 100% healthcare coverage for you and your family, gym memberships, iPhone 3G + minutes & data, 3G data cards, cable/DSL at home. Free drinks, free meals while working (new private chef too!). And more.
  • Great office. Walking distance to downtown Mountain View, across the street from train & lightrail, near Highway 85. 7.1 channel home theater, dual 30″ displays + Mac Pro + MacBook Pro/Air, jaw-dropping photography on the walls (and an in-house studio to shoot your own). Healthy cube/office decoration budget. (Ok, this is getting really fun to write.)

Whew!  (Yes, I think our employees are our most valuable asset.  Can you tell?)

So, do you have what it takes? At the very least, you’ll need:

  • A passion for open data.
  • An understanding of how important privacy controls are.
  • Experience with web services, especially REST. SOAP and XML-RPC fans, this isn’t the place for you (but knowledge of black magic ain’t bad - just don’t practice it here!).
  • Modern scripting language experience (PHP, Python, Ruby). We use PHP (and so will you!).
  • History building apps (big or small) for platforms like Facebook, OpenSocial, etc.
  • Understanding of current and upcoming social networking technologies: OpenID, OAuth, microformats, etc
  • Experience with the SmugMug API a big plus.

If this sounds like your brand of magic, please contact us and let us know you’re our next Sorcerer.

If not, please tell your magic-working friends that the opportunity of a lifetime is right here…

Thanks!

Categories: Technology