Are you getting errors building Amazon lambda functions? Don’t fret I got you!

aws lambda python

This post has gotten popular. So I wrote a better easier guide to diving into serverless lambda here. It’s quicker. I promise!

Why does Amazon make lambda functions so hard to create? Well my guess is that when you live at the bleeding edge you should expect to get scrapes!!

Check out: How daily notes help me work better with clients

Everybody is trying to build lambda functions these days. And it’s no wonder. Once you get them running, Amazon takes care of all the infrastructure drudge work! So cool.

Join 32,000 others and follow Sean Hull on twitter @hullsean.

I’ve been spending some time trying to get answers out of AWS support, and let me tell you it’s no fun. Yes all this stuff is new technology, and nobody has expertise in it in the way say you might in Linux or Oracle or another technology that’s been around for a decade.

Still you’d hope the techs would have some clue. In the end it was a slog dealing with support, and I think I was the one teaching them!

I did find Matt Perry’s howto which is pretty good.

Hopefully my own notes can help someone, so read on!

1. No lambda_function?

The very first issue you’re gonna run into is if you name the file incorrectly, you get this error:

Unable to import module 'lambda_function': No module named lambda_function

If you name the function incorrectly you get this error:

Handler 'handler' missing on module 'lambda_function_file': 'module' object has no attribute 'handler'

On the dashboard, make sure the handler field is entered as function_filename.actual_function_name and make sure they match up in your deployment package.

If only the messages were a bit more instructive that would have been a simpler step, but oh well!

Also: Is Amazon too big to fail?

2. No module named MySQLdb

This is a very tricky one. I mean after all you just spent all this time building your deployment package specifically for lambda, so what gives??

"Unable to import module 'lambda_function': No module named MySQLdb"

Turns out when you use a virtualenv, files will be installed into proj/lib/python2.7/site-packages/ or lib64. However Lambda wants them in the root proj/ directory! So move them there. I know I know. Weird, but that’s what they want.

Related: When hosting on Amazon turns bloodsport

3. Can’t find libmysqlclient

If you’re using the MySQLdb library like I was, you’ll eventually bump into this error:

Unable to import module 'lambda_function': cannot open shared object file: No such file or directory

Turns out that /usr/lib/ needs to be COPIED from /usr/lib. Don’t do “mv” or your system won’t have the mysql lib anymore!

Related: Are SQL databases dead?

4. Use the Amazon Lambda environment

One thing the support pointed out is that AWS as *supported images* for lambda development.

After all the errors above were resolved, it’s not clear to me that the supported AMI’s are truly required. However if you’re hitting intractable problems building a properly lambda deploy, you might wanna look at building one of these boxes.

Read: Why dropbox didn’t have to fail

5. Build your lambda deploy package

Now let’s roll it all together. Here’s are all the steps to build your deploy package.

- SSH to the instance
- mkdir test
- virtualenv test
- source proj/bin/activate
- sudo yum groupinstall 'Development Tools'
- sudo yum install mysql
- sudo yum install mysql-devel
- pip install MySQL-python
- cd test
- emacs -nw
- add your code to that file
- save the
- mv proj/lib/python2.7/site-packages/* proj/
- mv proj/lib64/python2.7/site-packages/* proj/
- rm -rf proj/lib (don't need dist-packages in the deploy pkg)
- rm -rf proj/lib64 (don't need dist-packages
- zip -r *

Also: How to hire a developer that doesn’t suck?

6. Upload your code

Uploading your code via the AWS dashboard is fine when you’re first testing things. But after a while it’ll get tiring going in the front door.

Create a new lambda function by specifying the basics as follows:

aws lambda create-function \
--function-name testfunc1 \
--runtime python2.7 \
--role arn:aws:iam::996225510001:role/lambda_basic_execution \
--handler lambda_function_file.handler_name \
--zip-file file://

And when you want to update your function, do the following:

aws lambda update-function-code \
--function-name testfunc1 \
--zip-file file://

Also: How to deploy on EC2 with Vagrant

Good luck with lambda. Once you get past Amazon’s weak documentation it’s pretty cool to be in a serverless computing environment. Happy deploying!

Get more. Grab our exclusive monthly Scalable Startups. We share tips and special content. Our latest Why I don’t work with recruiters

Which tech do startups use most?

MySQL on Amazon Cloud AWS

Leo Polovets of Susa Ventures publishes an excellent blog called Coding VC. There you can find some excellent posts, such as pitches by analogy, and an algorithm for seed round valuations and analyzing product hunt data.

He recently wrote a blog post about a topic near and dear to my heart, Which Technologies do Startups Use. It’s worth a look.

One thing to keep in mind looking over the data, is that these are AngelList startups. So that’s not a cross section of all startups, nor does it cover more mature companies either.

In my experience startups can get it right by starting fresh, evaluating the spectrum of new technologies out there, balancing sheer solution power with a bit of prudence and long term thinking.

I like to ask these questions:

o Which technologies are fast & high performance?
o Which technologies have a big, vibrant & robust community?
o Which technologies can I find plenty of engineers to support?
o Which technologies have low operational overhead?
o Which technologies have low development overhead?

1. Database: MySQL

MySQL holds a slight lead according to the AngelList data. In my experience its not overly complex to setup and there are some experienced DBAs out there. That said database expertise can still be hard to find .

We hear a lot about MongoDB these days, and it is surely growing in popularity. Although it doesn’t support joins and arbitrary slicing and dicing of data, it is a very powerful database engine. If your application needs more straightforward data access, it can bring you amazing speed improvements.

Postgres is a close third. It’s a very sophisticated database engine. Although it may have a smaller community than MySQL, overall it’s a more full featured database. I’d have no reservations recommending it.

Also: Top MySQL DBA Interview questions

2. Hosting: Amazon

Amazon Web Services is obviously the giant in the room. They’re big, they’re cheap, they’re nimble. You have a lot of options for server types, they’ve fixed many of the problems around disk I/O and so forth. Although you may still experience latency around multi-tenant related problems, you’ll benefit from a truly global reach, and huge cost savings from the volume of customers they support.

Heroku is included although they’re a different type of service. In some sense their offering is one part operations team & one part automation. Yes ultimately you are getting hosting & virtualization, but some things are tied down. Amazon RDS provides some parallels here. I wrote Is Amazon RDS hard to manage?. Long term you’re likely going to switch to an AWS, Joyent or Rackspace for real scale.

I was surprised to see Azure on the list at all here, as I rarely see startups build on microsoft technologies. It may work for the desktop & office, but it’s not the right choice for the datacenter.

Read: Are generalists better at scaling the web?

3. Languages: Javascript

Javascript & Node.js are clearly very popular. They are also highly scalable.

In my experience I see a lot of PHP & of course Ruby too. Java although there is a lot out there, can tend to be a bear as a web dev language, and provide some additional complication, weight and overhead.

Related: Is Hunter Walk right about operations & startups?

4. Search: Elastic Search

I like that they broke apart search technology as a separate category. It is a key component of most web applications, and I do see a lot of Elastic Search & Solr.

That said I think this may be a bit skewed. I think by far the number one solution would be NO SPECIFIC SEARCH technology. That’s right, many times devs choose a database centric approach, like FULLTEXT or others that perform painfully bad.

If this is you, consider these search solutions. They will bring you huge performance gains.

Check this: Are SQL Databases Dead?

5. Automation: Chef

As with search above, I’d argue there is a far more prevalent trend, that is #1 to use none of these automation technologies.

Although I do think chef, docker & puppet can bring you real benefits, it’s a matter of having them in the right hands. Do you have an operations team that is comfortable with using them? When they leave in a years time, will your new devops also know the technology you’re using? Can you find a good balance between automation & manual configuration, and document accordingly?

Read: Why are database & operations experts so hard to find?

Get more. Grab our exclusive monthly Scalable Startups. We share tips and special content. Our latest Why I don’t work with recruiters

Scalability Happiness – A Quiet Query Log

Peter Van Allen - Pin Drop

Join 7500 others and follow Sean Hull on twitter @hullsean.

There’s a lot of talk on the web about scalability. Making web applications scale is not easy. The modern web architecture has so many moving parts. How can we grapple with the underlying problem?

Also: Why Are MySQL DBAs So Hard to Find?

The LAMP stack scales well

The truth that is half right. True there are a lot of moving parts, and a lot to setup. The internet stack made up of Linux, Apache, MySQL & PHP. LAMP as it’s called, was built to be resilient, dynamic, and scalable. It’s essentially why Amazon works. Why what they’re doing is possible. Windows & .NET for example don’t scale well. Strange to see Oracle mating with them, but I digress…

Linux and LAMP that is built on top of it, are highly scalable and dynamic to begin with.

Also: AirBNB Didn’t Have to Fail During an AWS Outage

Ok, so what’s this got to do with MySQL? Well a LOT.

The webserver tier, the caching layers like memcache & varnish, as well as the search tier solr. These all scale fairly easily because their assets are fixed. Or almost so.

The database tier is different. So what affects performance of a database server? Server size? Main memory? Disk speed? The truth is all of those. But

Also check out: The Sexiest New Feature of AWS Speeds Up EBS

After you setup the server – set memory settings and so forth, it’s a fairly fixed object. True there are parameters to tweak but on the whole there isn’t a ton of day-to-day tuning to do.

Well if that’s true, why does performance take a hit?! As applications grow, the db server slows down, don’t we need to tweak server settings? Do we need new hardware?

Read this: A CTO Must Never Do This

The answer is possibly, but 9 times out of 10 what really needs to happen is queries must be tuned.

In 17 years of consulting that is the single largest cause of scalability problems. Fix those queries and your problems are over.

The Elephant in the Room – Query Tuning

I was talking with a colleague today at AppNexus. He said, so should we do some of that work inside the application, instead of doing a huge UNION or a large JOIN? I said yes you can move work onto the application, but it makes the application more complex. On the flip side the webserver tier is easier to scale. So there are tradeoffs.

I said this:

By and large, if scalability is our goal, we should work to quiet the activity in the slow query log. This is an active project for developers & DBAs. Keep it quiet and your server will run well.

Also: Top MySQL DBA Interview Questions for Candidates, Hiring Managers & Recruiters

Yet I still talk to teams where this is mysterious. It’s unclear. There’s no conviction there. And that’s where I think DBAs are failing. Because this is our subject matter expertise, and if we haven’t convinced developer teams of this, we’re not working together enough. API teams aren’t separate from DBA and operations. Siloing technology departments is a killer…


As you roll out new code, if some queries show up, then those need attention. Tweak the code until the queries drop out. This is the primary project of scalability.

When should I think about upgrading hardware?

If your code is stable, but you’re seeing a steady line rising on load average of the server, *THEN* go up in hardware. Load average means cpu & disk are being taxed. The server can’t keep up.

Related: Should I use RDS or build a MySQL server on AWS?

Devops means work together!

I close with a final point. Devops means bring dev & ops together! Don’t silo them off in different wings. Communicate. DBAs it’s your job to educate Developers about scalability and help with query tuning. Devs, profile new SQL code, test with large datasets & for god sakes don’t use an ORM – it’s one of 5 things toxic to scalability. Run explain and be sure to index all the right columns.

Together we can tackle this scalability thing!

Get some in your inbox: Exclusive monthly Scalable Startups. We share tips and special content. Here’s a sample

Software Unit Testing – What is it and why is it important?

Software development is composed of individual components.  As developers are building these units, they build tests to verify them for correctness.  These tests can verify the environment, they can verify data, they can verify edge cases and include test harnesses.  In essence they verify that the code meets the design specification.

There are a few key advantages to the unit testing approach:

  1. Self-Documenting – The tests themselves provide a type of documentation for the system as a whole.
  2. Advances Refactoring – At a later date you may need to repair, rewrite or refactor portions of code.  Previously built unit tests provide a tremendous help to make sure your changes still meet the previous design specification.
  3. Simplifies Functional Testing – With unit testing as an ongoing concern, the final components will likely perform more reliably, and if not the tests & self-documentation may point to how or why they fail to meet some specification.

Sean Hull Quora Discussion – What is software unit testing?