Why I like Etsy’s site performance report


Etsy publishes a great tech blog titled Code As Craft.

Join 28,000 others and follow Sean Hull on twitter @hullsean.

I was recently sifting through some of their newer posts & stumbled upon their Q2 2015 Site Performance Report. It’s really in-depth, though not impossibly technical. Here’s what I liked.

1. Transparency to business & public

Show real performance to customers

The first thing I noticed while reading is the strong show of transparency. The blog is public, so it's not just an internal document shared within the company, but one shared with the wider world. True, presented as a technical post it may only appeal to a segment of readers, but it's great nonetheless.

Show real performance to non-technical business units

I think this kind of analysis & summary also provides transparency to the business itself. Product teams, business operations & sales teams can all view what’s happening. Where are there problems? What is being done to address them?

Also: When hosting data on Amazon turns bloodsport

2. Highlighting change

Added pagination to the cart

One thing that popped out was the discussion of pagination changes that impacted page load times in the shopping cart. Load times there are particularly crucial, because the cart is where customers can "abandon" an order out of frustration.

Illustrating performance impact to product decisions

When product teams evaluate a new feature and can see how changes affect performance, it better *sells* what all those engineering resources are being used for.

Related: 5 reasons to move data to amazon redshift

3. Where we don’t have data

We can’t analyze what data we haven’t captured

The report highlights that data around the shopping cart is new. That's great because it demonstrates the value of collecting data in the first place, by providing insights that were not available previously. It also pushes for more metrics collection & analysis as the business begins to see the value of all of this gymnastics.

Read: Is Amazon too big to fail?

4. Product tradeoffs

The discussion around the shopping cart performance also illustrates how the business makes product decisions. The engineering team can only build & write so much code. Deciding to spend time on pagination, means time not spent on some other new feature. Which is more valuable? Selling new feature A in one corner of the product, that customers may spend real money on? Or speeding up page load times on page B?

Also: Is Apple betting against big data?

5. Cleaner data

At a Look & Tell event, I heard Lincoln Ritter talk about Data as a product to the business.

When you expose a performance report like this to the business, an iterative process begins to happen. The company gains insight from the report, makes better decisions, and thus can spend more energy, time & resources on clean data. Cleaner data in turn means better reports, which produce better decisions & so on.

Also: What is venue analytics & why is it important?

Get more. Grab our exclusive monthly Scalable Startups. We share tips and special content. Our latest Why I don’t work with recruiters

When hosting data on Amazon turns bloodsport


There’s a strong trend to automation across the cloud. That’s a great thing for startups because it reduces operational headaches & lets them focus on building products.


But as that trend begins to touch the database tier, all sorts of complications emerge. Let’s take a look at some of the tradeoffs.

1. Database as a service trend

I was recently reading Baron Schwartz’s article on the trend to database as a service.

I work with a lot of venture backed startups & pay close attention to what’s happening in New York & SF. From where I’m standing I see a similar trend. As automation simplifies management across the application stack, from load balancers to web & search servers, the same advantages are moving to database management.

Also: How to automate MySQL analysis on Amazon RDS

2. How Amazon RDS helps

Amazon’s RDS offers firms a data solution for Oracle & SQL Server as well as MySQL. For those just starting, it offers a long list of advantages.

o quick push-button deployment in minutes
o standardized parameters settings that just work
o ability to scale up or down from the dashboard
o automated backups
o multi-az so you can sleep at night

This brings a huge advantage to startups. Many have a team of developers but aren’t large enough to need an operations team and can’t afford a dedicated database administrator.
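That quick push-button deployment really is close to a one-liner. As a rough sketch using the AWS CLI (the instance identifier, class, storage size & credentials below are placeholders, not from this post):

```shell
# Launch a Multi-AZ MySQL instance on RDS.
# Identifier, class, storage & credentials are illustrative placeholders.
aws rds create-db-instance \
  --db-instance-identifier mydb \
  --db-instance-class db.m3.medium \
  --engine mysql \
  --allocated-storage 100 \
  --master-username admin \
  --master-user-password 'pick-a-real-password' \
  --multi-az

# Scaling up later is a dashboard click, or one more command:
aws rds modify-db-instance \
  --db-instance-identifier mydb \
  --db-instance-class db.m3.large \
  --apply-immediately
```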

Amazon is obviously helping these firms raise the bar. And that’s a good thing.

Related: RDS or MySQL 10 use cases

3. How Amazon RDS hurts

As you get bigger, your needs will grow too. You’ll have tens of millions of customers, and with more customers comes an even higher bar. Zero downtime becomes critical. It’s then that Amazon’s solution starts to become frustrating.

Unpredictable upgrades

MySQL upgrades on RDS are a messy activity. Amazon will restart the instance, back up the instance, perform the upgrade, then restart again. Each of these restarts takes a few minutes, and the whole operation may have you down for ten. It becomes more frustrating still because your hands are completely tied. You don't know when it will happen or what exactly will change!

When you roll-your-own instance, an upgrade can be performed in a matter of seconds. No instance restarts are necessary and you can monitor the process to know exactly where you are. This is the kind of control you’re going to want if you have millions of customers relying on your site & uptime.

Unnecessary slow restarts

When you apply parameter changes on RDS, some require a MySQL restart. Amazon forces the whole server to restart, stretching the downtime from a few seconds (when you roll your own) to many minutes. And while some parameters can be changed online, Amazon's handling can provoke strange behavior that is not always predictable.
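To make the contrast concrete, here's a hedged sketch. On a server you manage, a dynamic parameter can be changed online in a second; on RDS the same change goes through a parameter group (the group & parameter names below are illustrative):

```shell
# Self-managed MySQL: apply a dynamic parameter online, no restart needed.
mysql -u admin -p -e "SET GLOBAL max_connections = 500;"

# RDS: the change goes through a parameter group instead.
# Dynamic parameters apply immediately; static ones wait for a reboot.
aws rds modify-db-parameter-group \
  --db-parameter-group-name my-params \
  --parameters "ParameterName=max_connections,ParameterValue=500,ApplyMethod=immediate"
```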

With the frequency of these types of changes, you’ll quickly grow tired and frustrated with RDS.

EBS Snapshots are not portable

As mentioned above, Amazon uses its standard filesystem snapshot technology to perform backups. While this works well, it can be slow & unpredictable in a multi-tenant environment.

When you roll your own, you can take advantage of xtrabackup, and perform hot backups against your database with zero downtime. This is a real godsend. What’s more they are portable, and can be moved to any other server even ones not hosted in Amazon’s cloud!
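A hot backup with Percona XtraBackup looks roughly like this (user, password & paths are placeholders):

```shell
# Hot backup with Percona XtraBackup -- MySQL stays online the whole time.
# Credentials & paths are illustrative placeholders.
xtrabackup --backup --user=backup --password=secret \
  --target-dir=/data/backups/nightly

# Prepare the backup so it's consistent & ready to restore --
# on any server, including one outside Amazon's cloud.
xtrabackup --prepare --target-dir=/data/backups/nightly
```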

Promoting a read-replica is slow too!

One feature that Amazon touts is creating copies or "read replicas" of your data. These are great and can facilitate easy copying of data. However, promoting one again involves unnecessary restarts, which are slow.

When you roll your own, you can promote a read-replica or read-only slave in seconds. A few seconds can seem invisible to end users, while minutes will be perceived as a real outage or downtime.

Read: Is zero downtime even possible with RDS?

4. Is migration an option?

So what to do? As I mentioned above, there are real advantages to startups deploying their first database. It really does help. I would argue for many it can be a good place to start.

If you’re starting to outgrow RDS and frustrated with the limitations, performance tuning headaches & unneeded downtime, luckily you have options.

Migrating off of RDS onto a physical server can be done in a number of ways.

o slave off of the master

Here you build a MySQL slave on a standard EC2 instance, with your RDS instance as the master. When you're caught up, bring your site down temporarily. Reset the slave & set it to read-write mode. Then point your webservers at your new EC2 instance and bring the site back up. If done carefully, 10 to 20 seconds of downtime should be plenty.

Don’t forget to run through the process with a firedrill first!

o dump & import

Another way to move your data is mysqldump. This option is slower & brings a lot more downtime, but it may be necessary in some cases.

Also: 5 Reasons to move data to Amazon Redshift

5. Speed: It’s the database

Fred Wilson says speed is the number one feature of a web application. If customers are frustrated & waiting, they may leave & not come back. On the web it can be everything.

Many firms are rushing to database as a service to simplify administration. While that’s wonderful at the beginning, as you grow performance will become more of a day-to-day concern. And when it does, the database is going to be big on your list of headaches.

Web application performance inevitably involves the database, and when it does, your decision to choose database as a service may come into question. Don't be afraid to bite the bullet and manage things yourself when that time comes.

Also: Is upgrading RDS like a shit-storm that will not end?


How 1and1 failed me


I manage this blog myself. Not just the content, but also the technology it runs on. The systems & servers are from a hosting company called 1and1. And recently I had some serious problems.


The publishing platform, wordpress, was a few versions out of date. Because of that, some vulnerabilities surfaced.

1. Malware from Odessa

While my eyes were on content, some Russian hackers managed to scan my server and, due to the older version of wordpress, found a way to install some malware onto the box. This would be invisible to most users, but was nevertheless dangerous. As a domain name with a fifteen-year life, it has some credibility among the algorithms & search engines. There's some trust there.

Google identified the malware, and emailed me about it. That was the first I was alerted in mid-August. That was a few days before I left for vacation, but given the severity of it, I jumped on the problem right away.

Also: Why I say Always be publishing

2. Heading off a lockout

I ordered up a new server from 1and1 to rebuild on. I then set to work moving over content, and completely reinstalled the latest version of wordpress.

Since it was within the old theme that the malware files had been hidden, I eliminated that whole directory & all files, and configured the blog with the newest wordpress theme.

Around that time I got some communication from 1and1. As it turns out they had been notified by google as well. Makes sense.

Given the shortage of time, and my imminent vacation, I quickly called 1and1. As always their support team was there & easy to reach. This felt reassuring. I explained the issue, how it occurred, and all the details of how the server & publishing system had been rebuilt from the ground up.

This was in the August 24th timeframe. As I had received emails about a potential lockout, I was reassured by the support specialist that the problem had been resolved to their satisfaction.

Read: Do managers underestimate operational cost?

3. Vacation implosion

I happily left for vacation knowing that all my hard work had been well spent.

Meantime, around August 25th, 1and1 sent me further emails asking for "additional details". Apparently the "I'm going on vacation" note had not made it to their security division. Another day went by, and since they received no email from me, the server was locked!

Being locked means the server is completely unreachable. Totally offline. No bueno! That's certainly frustrating, but websites do go down. What happened next was worse.

Since I use Mailchimp to host my newsletter, I write that well in advance each month. Just like clockwork the emails go out to my 1100 subscribers on September 1st. Many of those are opened & hundreds click on the link. And there they are faced with a blank screen & browser. Nothing. Zilch! Offline!

Also: Why I use Airbnb chat even when texting is easier

4. The aftermath

As I return to connectivity, I begin sifting through my emails. I receive quite a few from friends and colleagues explaining that they couldn't view my newsletter. I immediately remember my conversation with 1and1, their assurances that the server wouldn't be locked, and that all was well. I'm thinking "I bet that server got locked anyway". Damn it, I'm angry.

Taking a deep breath, I call up 1and1 and get on the line with a support tech. Being careful not to show my frustration, I explain the situation again. I also explain how my server was down for two weeks and how it was offline during a key moment when my newsletter goes out.

The tech is able to reach out to the security department & explain things again. Without any additional changes to my server or technical configuration, they are then able to unlock the server. Sad proof of a bureaucratic mixup if there ever was one.

Also: Is Amazon too big to fail?

5. Reflections on complexity

For me this example illustrates the complexity in modern systems. As the internet gets more & more complex, some argue that we are building a sort of house of cards. So many moving parts, so many vendors, so many layers of software & so many pieces to patch & update.

As things get more complex, there are more cracks for the hackers to exploit. And patching those up becomes ever more daunting.

Related: Are we fast approaching cloud-mageddon?


What Deborah Tannen taught me about conversation & interruption


I was recently invited to attend a charity event in Washington DC. Dinner was a catered affair of 300 with a few senators & Muhammad Yunus there to talk about micro financing.

After dinner we broke up into some smaller groups, and had great conversations into the night. It was interesting to me as I don’t often rub elbows with lobbyists & political animals. While we were all talking, the subject of language came up, and in particular how different people’s styles affect how they come off.


I became really engaged, as this topic has always interested me. I was introduced to the ideas of Deborah Tannen. She’s a professor of linguistics from Georgetown University, and an expert on the topic.

Afterward, I went straight to my kindle & bought her seminal book "You Just Don't Understand".

Boy do I understand a lot more now.

1. Conversational style varies by culture & gender

Across cultures, from Europeans to Asians, North Americans to South Americans, conversational styles vary. Some pause longer between breaths, while others make briefer pauses. Some treat conversation more like judge & jury, where each speaker should be carefully afforded the chance to take the stage, while others prefer the casual chance to jump in, with constant overlap.

These differences lead to perceptions of pushiness versus interest, interruption versus dominance, interest versus boredom. Since each culture has a different style, it can get rather complicated interpreting someone's intentions if you're not from that culture.

What’s more these vary quite a bit between men & women.

Also: What I learned from Jay Heinrichs about click worthy blog titles

2. Report & rapport talk are different

Report talk happens in public, perhaps at a lecture, or out with a large group of friends around the dinner table. There, stories & conversation revolve around the larger group.

Rapport talk on the other hand is at home, among intimates.

She says that women tend to prefer the latter while men prefer the former. So in different circumstances it can appear that one or the other has “nothing to say”, when it actually revolves around their preferences of when to speak.

Related: Is automation killing old-school operations?

3. Like & respect

Women’s behavior & style of speaking is rooted in the goal of being liked. So there are many cases where they may downplay themselves, to reach a more equal state with those around them.

Men’s behavior & conversational style is based around seeking respect. This can often mean emphasizing differences, and not parity.

Read: Do managers underestimate operational cost?

4. Contest or connection?

Men often see the world through the lens of contest, especially in relationships with others. Women on the other hand tend to see it as an interconnected network. By building bonds you strengthen that network.

These two styles inform dramatically different behaviors in similar situations.

Also: Is Reid Hoffman right about career risk?

5. Interest or independence

Here’s another example of how men & women may see things differently.

When men change the subject, women think they are showing a lack of sympathy — a failure of intimacy. But the failure to ask probing questions could just as well be a way of respecting the other’s need for independence.

So it seems styles & priorities inform intention & interpretation of a lot in conversations.

Although all of this doesn’t resolve or put to rest these differences, being informed can certainly help a lot towards understanding.

Also: What I learned from the 37 Signals team about work & startups


Are open source projects run like a democracy or an oligarchy?


I was reading Fred Wilson’s comments recently on The Bitcoin XT Fork. In it he discussed how open source developers manage their projects.

“A group of open source core developers are a democratic system.”

I was surprised by this comment because I had never thought of it as democratic. Here are my thoughts…


1. Indefinite tenure

Open source projects typically have a leader with indefinite tenure. He can’t be voted out. When developers are unhappy with how things are run, or how they’re evolving, they typically “fork” the project and go their own way.

That would be akin to Texas seceding from the union because it's unhappy with how things are run in Washington.

Also: Is the difference between dev & ops a four-letter word?

2. Inherited rights

Like an aristocracy, leaders of an open source project typically hold inherited rights, earned through merit or seniority. They are the ones with admin rights on the git account.

Divine rights indeed!

Related: Is automation killing old-school operations?

3. Appointments made by merit

Developers join open source projects, and move through the ranks mostly by merit. Sure there’s some back scratching, and massaging that helps too. Personality surely matters, but primarily skill at contributing code & architecture ideas are paramount.

Read: Do managers underestimate operational cost?

4. Power rests with a small elite

For sure, not everyone can vote on an open source project's direction. It's a small elite group, admittedly those closest to the project and most knowledgeable, who control its direction.

Also: Is the difference between dev & ops a four-letter word?

5. Oligarchy or Aristocracy?

From wikipedia an Oligarchy is “a form of power structure in which power effectively rests with a small number of people”. That sounds closest.

While open source projects do have the indefinite political tenure of an authoritarian regime, they lack the strict obedience aspect.

However, open source projects do look a bit like an Aristocracy. Aristocracy is "a form of government that places power in the hands of a small, privileged ruling class. The term derives from the Greek aristokratia, meaning 'rule of the best'."

Also: Are SQL Databases dead?


Always be publishing



As an advisor to New York area startups & a long-time entrepreneur, I've found writing & publishing to be an extremely valuable use of time.

I follow the motto "Always be publishing". Here's why.

1. Form your voice

According to Fred Wilson, blogging has been one of the seminal decisions contributing to his success.

“It’s like Venus Fly Paper. When I write about topics that are relevant, suddenly anybody with a startup solution in that field will approach us. This works brilliantly.”

Also: 5 Things I learned from Fred Wilson & Mark Suster

2. Get in the conversation

The world online moves quickly and it can move in surprising directions. Hype, hysteria & buzz can direct the conversation as much as facts.

Getting into the conversation allows you to weigh in. This builds your credibility. As it puts you in the line of fire, you stand up & get heard.

Related: Is blogging crucial to career building?

3. Be in the line of fire

In sales there’s a saying, “always be closing”. It means always be in front of your customers, always be on point, always be getting deals done. That’s embodying your role as a salesman.

For builders, consultants, advisors, speakers & entrepreneurs, writing puts you directly in the line of fire. You express your opinions online loud & clear. Sometimes you will find critics picking apart your ideas. Sometimes they may correct you.

This process will help you hone your ideas. Strengthen some & modify & adjust others. All of it is good.

Read: Is building traffic & pagerank possible through active blogging?

4. Share your knowledge

As an advisor, entrepreneur or professional services consultant, you sell your knowledge & expertise. Why not share a bit of that with the world at large?

This is one part good samaritan, and one part testimonial of your skill & style.

Also: Is Ryan Holiday right about the internet & the death of journalism?

5. Learn by doing

Back in 2001 I wrote a book called Oracle + Open Source.

Along the way, writing chapter after chapter of material, there were times when I had to brush up on material. Or write & rewrite sections. Some of it wasn’t explained well, and other material I didn’t know as well as I needed to.

Today I intersperse howtos with writing on consulting, or industry trends. Inevitably a howto like Wrestling with bears or how I tamed Tungsten Replicator involves a lot of hands-on learning.

All of this is driven by blogging & publishing.

Also: Is the difference between dev & ops a four-letter word?


Why I use Airbnb chat even when texting is easier


If you’ve ever traveled & stayed with an Airbnb host, you know that once you book you can easily switch to text messaging. Sometimes this is easier. But as I found out, it’s smarter to stick with a channel that we all can share.


I had a similar experience with a recent consulting customer. The lesson was much the same.

Choose your communication channels wisely, for you may need them for other reasons later on.

1. Not what I paid for

I’ve been hosting travelers off & on through Airbnb for some time. It’s a fun past time, as you can meet some interesting people, and share a little bit of *your* city. That and there’s a little bit of extra income too, which doesn’t hurt.

One visitor I had wasn’t particularly happy with the setup. I’ve hosted dozens of people before, so I know that the space is popular to most. However this one guy seemed unhappy from the start. He didn’t read the fine print that it was a shared space with separate rooms. He was unhappy with the specific location too. And later he complained about a bicycle I had loaned him.

Also: Is Amazon too big to fail?

2. How Airbnb chat helped

At the end of his visit he asked for some of the fee to be refunded.

As I dug through our Airbnb chat, I copy/pasted our various communications, and in the end this helped clarify & remedy the situation. It also didn’t hurt that Airbnb themselves were there behind the scenes and could review all these messages as well.

Having a third party to arbitrate can make a big difference. Let's hope it doesn't come to that, but if it does, you want a communication channel they can also see.

Related: 5 Reasons to move data to Amazon Redshift

3. Consulting engagements & corporate emails

Over the years I generally use my own email for projects & engagements. However recently I took on a longer engagement. At the start there was some insistence on using an internal email for communications. I was hesitant, but eventually conceded as it tied in with google calendaring and various internal aliases.

As the months went by, I tried & failed to use both emails for correspondence. It was a habit that was hard to change. What's more, forwarding *all* emails to my own account was also difficult. With an ongoing barrage of messages numbering in the hundreds, it simply blew up my email account. That wasn't sustainable either.

Read: Do managers underestimate operational cost?

4. After you leave

You may not be thinking about the end of your consulting assignment at its start. But you should be. You'll engage in many communications, about a lot of different topics. Some about what is & isn't in scope. Some about deliverables & timelines.

You’ll also have communications has things unfold, and as they are delivered. All of these are crucial to the engagement, as evidence of what was done when. If after you leave, all those emails are gone (at least that you can reach), it can be problematic.

What’s more once you set a precedent communicating one way, it’s hard to change habits. Best to set the precedent strongly up front.

Also: Are we fast approaching cloud-mageddon

5. Your channel is your paper trail

In today's mobile-heavy world, there are tons of channels we can use to communicate. From WhatsApp to Slack, HipChat to email & text. They all have their strengths & weaknesses.

But sometimes we need to choose based on future needs. Leaving a paper trail can be important. Having future control over those past communications can bring legal benefit.

And all of these communications can help avoid misunderstandings if they’re available for review later.

Also: Which tech do startups use most?


Is Amazon too big to fail?


Amazon is the huge online retailer everyone knows well. However there is another side of Amazon, namely Amazon Web Services, which hosts many of the internet's largest websites.


In the infrastructure & operations world, Amazon is the Citibank, JP Morgan or Goldman Sachs of cloud providers.

1. Outage takes down Yelp & Netflix

As reported on Thousand Eyes among other places, Amazon had a major outage yesterday.

Amazon experienced a problem with how they route data over the network. Routing is the technical term for how the internet moves data around. When routing goes wrong at a provider like Amazon, the websites they host will go down too.

Also: Are we fast approaching cloud-mageddon?

2. Automation can’t save you

Netflix is famous for their great streaming service, and shows like House of Cards.

On the technology side they're also pretty famous. They deploy legions of Amazon servers to stream movies, and test their resilience with Chaos Monkey, an open source tool that randomly kills servers & components to prove the system stays up even when individual pieces go offline.

Yet a heavy reliance on Amazon itself, meant a wider outage for them was also an outage for Netflix.

Related: What tech do startups use most?

3. Of cloud monopolies

Amazon’s dominance in the cloud hosting space is incredible. There are providers that can beat them in compute power, speed & price. But with their incredible reach of global datacenters & relentless growth they are still the first choice for most internet shops.

What is the downside of such dominance? What happened yesterday illustrates it clearly. When Amazon goes down, so do the many businesses that rely on it, financial companies like Experian included.

Read: Do managers underestimate operational cost?

4. Diversify your data portfolio

In the banking world we can put together legislation, regulating banks. We can enact capital requirements or consider breaking up the largest ones. For investors & consumers you can diversify your portfolio, putting money in different asset classes & institutions. If one fund fails, others will balance it out.

We can do the same with cloud hosting. For larger internet applications, deploying on multiple clouds can be very beneficial. In that case an outage at Amazon, would merely mean your global load balancer kicks in, sending traffic to your plan B servers.
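As a hedged sketch of that plan-B failover, here's what a DNS-level approach with Route 53 might look like. The zone ID, domain, health check & IP below are placeholders; the secondary record pointing at your other cloud would be defined the same way with "Failover": "SECONDARY":

```shell
# Create the PRIMARY half of a Route 53 failover pair.
# Zone ID, domain, health check ID & IP are illustrative placeholders.
aws route53 change-resource-record-sets \
  --hosted-zone-id Z123EXAMPLE \
  --change-batch '{
    "Changes": [{
      "Action": "CREATE",
      "ResourceRecordSet": {
        "Name": "www.example.com",
        "Type": "A",
        "SetIdentifier": "primary-cloud",
        "Failover": "PRIMARY",
        "TTL": 60,
        "HealthCheckId": "hc-primary",
        "ResourceRecords": [{"Value": "203.0.113.10"}]
      }
    }]
  }'
```

When the health check fails, Route 53 stops answering with the primary record & traffic flows to the secondary automatically.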

Also: Replicate big data to Amazon Redshift with Tungsten


Replicate MySQL to Amazon Redshift with Tungsten: The good, the bad & the ugly


Heterogeneous replication involves moving data from one database platform to another. This is a complicated endeavor because datatypes, date & time formats, and a whole lot more tend to differ across platforms. In fact it's so complex that many enterprises simply employ a commercial solution to take away the drudgery.


Enter Tungsten, which supports these types of deployments on platforms such as PostgreSQL, MongoDB, Oracle, Redshift & Vertica. With custom-built appliers, the field is infinite!

With that I’ve set out to get things working with Amazon Redshift. If you’re still struggling with the basics check out Wrestling with bears or how I tamed Tungsten Replicator.

1. Connect to redshift

The first thing you’ll need to do is allow your Tungsten boxes to reach redshift. Seems obvious, but when you’re juggling all these apples & oranges for the first time, it may slip you mind.

Configure your AWS security group to allow tungsten boxes

Get the external IP address of your tungsten box. If it’s in DNS this will work even if ping doesn’t.

$ ping

Add it to your Redshift security config. I created a special group called Tungsten and added the two tungsten boxes by IP address. That's because these machines were on a different AWS account. If they're on the same account, you could allow the entire EC2 group and be done.
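If you'd rather script that change than click through the dashboard, a hedged sketch using the classic Redshift cluster security groups looks like this (group name & address are placeholders):

```shell
# Allow a tungsten box to reach the Redshift cluster by IP.
# Group name & CIDR are illustrative placeholders.
aws redshift authorize-cluster-security-group-ingress \
  --cluster-security-group-name tungsten \
  --cidrip 203.0.113.25/32
```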

Install psql client

The best way I found to test the connection was psql. Install that:

$ apt-get install postgresql-client

Verify your connection:

$ psql -p 5439 -h --username=root -d dwh

Also: Are SQL Databases dead?

2. Configure S3 access

Tungsten uses S3 heavily to move data into Redshift.

(I outlined this previously in 5 Reasons to move data to Amazon Redshift.)

Install s3tools package

Tungsten uses the s3cmd to interface with the Amazon S3 API. Let’s install that:

$ apt-get install s3cmd

Now edit the .s3cfg file of the tungsten user. Change:

access_key = AAAAAAA
secret_key = BBBBBBB

Lastly edit the tungsten file /opt/continuent/share/s3-config-redshift.json. There are four parameters:

"awsS3Path" : "s3://tungstenbucket",
"awsAccessKey" : "AAAAAAA",
"awsSecretKey" : "BBBBBBB",
"cleanUpS3Files" : "false",

Related: Is Oracle killing MySQL?

3. Create tables on Redshift

In a heterogeneous environment, that is, where the source and destination databases are different platforms, Tungsten cannot create tables for you.

It will, however, give you a helping hand in the process. Enter the ddlscan tool, which scans the CREATE TABLE statements on your source database and generates them for your target platform.

For each table in the source database, there will be a stage table in Redshift:

$ ddlscan jdbc:mysql://localhost:3306/test -user sync -db test -template ddl-mysql-redshift-staging.vm > test_stage.sql

$ cat test_stage.sql
SQL generated on Thu Jun 04 20:06:45 UTC 2015 by ./ddlscan utility of Tungsten

url = jdbc:mysql:thin://
user = sync
dbName = test


DROP TABLE test.stage_xxx_sean;
CREATE TABLE test.stage_xxx_sean
(
  tungsten_opcode CHAR(2),
  tungsten_seqno INT,
  tungsten_row_id INT,
  tungsten_commit_timestamp TIMESTAMP,
  c1 VARCHAR(256) /* VARCHAR(64) */,
  id INT,
  PRIMARY KEY (tungsten_opcode, tungsten_seqno, tungsten_row_id)
);

And also a base table in redshift:

$ ddlscan jdbc:mysql://localhost:3306/test -user sync -db test -template ddl-mysql-redshift.vm > test.sql

$ cat test.sql
SQL generated on Thu Jun 04 20:06:51 UTC 2015 by ./ddlscan utility of Tungsten

url = jdbc:mysql:thin://
user = sync
dbName = test


DROP TABLE test.sean;
CREATE TABLE test.sean
(
  c1 VARCHAR(256) /* VARCHAR(64) */,
  id INT
);

Lastly apply those scripts to your Redshift database:

$ psql
dwh# \i test_stage.sql
dwh# \i test.sql

Read: Are we fast approaching cloud-mageddon?

4. Troubleshoot applier

Encountered “Delimiter Not Found” issue

This issue was mysterious, and remains so a bit. Here’s what I did to work around it.

First I had an issue with the path, which I fixed:

  "awsS3Path" : "s3://tungstenbucket",

The bad path was causing an interim bucket to be created, but fixing it did not solve things.

Ok. So I hacked this a bit.

Can anyone help me troubleshoot what happened & why?

A. I skipped transactions

I brought the applier back online with this command.

trepctl -service redshift online -skip-seqno 1,1-100

B. I did lots of inserts & deletes on MySQL

I then did about 200 of these:

mysql> insert into test.sean values ('hi there', 20);
mysql> delete from test.sean where id = 20;
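
Typing 200 of those pairs by hand gets old. A small loop can generate the same insert/delete statements — this is a sketch using the test.sean table from this walkthrough; pipe the output into the mysql client:

```shell
# Emit ~200 INSERT/DELETE pairs against test.sean.
# Usage:  bash gen_pairs.sh | mysql -u sync -p
for i in $(seq 1 200); do
  echo "INSERT INTO test.sean VALUES ('row $i', $i);"
  echo "DELETE FROM test.sean WHERE id = $i;"
done
```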

C. Now seeing data

dwh=# select * from test.sean;
                 c1                  | id 
-------------------------------------+----
 working......                       | 25
 hello sean i have an exclamation !! | 27
 hello sean i came from mysql        | 26
(3 rows)

I also set cleanUpS3Files to false, and now I can see the staged csv files in the S3 bucket.

So that indicates all those INSERTs followed by DELETEs cleaned things up.

Also: How do I find entrepreneurial focus?

5. Test data & table changes

B. Tested INSERT

At first the csv files were getting cleaned up by Tungsten. I added this option to the s3-config-redshift.json file:

"cleanUpS3Files" : "false",

Then the files are kept around so we can review them. An insert record shows up in S3 like this:

"I","417","1","2015-06-05 17:44:35.000","tungsten new csv file? ","33",null

C. Tested DELETE

A DELETE record shows up in S3 like this:

"D","419","1","2015-06-05 17:45:48.000",null,"26",null

D. Tested UPDATE

An UPDATE record shows up in S3 like this:

"D","420","1","2015-06-05 17:48:55.000",null,"31",null
"I","420","2","2015-06-05 17:48:55.000","changed message text for redshift+tungsten update","31",null


E. Tested ALTER TABLE

As mentioned previously, ALTER TABLE is *NOT* supported. However, after doing the ALTER on MySQL the applier does *NOT* go offline, and there are no errors. That’s because Tungsten does not support DDL changes in a heterogeneous environment, and simply filters them out.

The applier *DOES* go offline after you try a new INSERT. That’s because it receives an INSERT record that no longer matches the table definition.

“trepctl status” shows the following:

pendingExceptionMessage: CSV loading failed: schema=test table=sean CSV file=/tmp/staging/redshift/staging0/test-sean-413.csv message=Wrapped org.postgresql.util.PSQLException: ERROR: Load into table ‘stage_xxx_sean’ failed. Check ‘stl_load_errors’ system table for details. (../../tungsten-replicator//samples/scripts/batch/redshift.js#145)

To fix it, I added the new column to both the base table and the stage table on Redshift:

redshift# alter table test.sean add column c3 integer default null;

redshift# alter table test.stage_xxx_sean add column c3 integer default null;

Then I brought the applier back online:

$ trepctl -service redshift online

Then check the status. It should say ONLINE for state.

$ trepctl status
Processing status command...
---- -----
appliedLastEventId : mysqld-bin.000022:0000000000000566;-1
appliedLastSeqno : 424
appliedLatency : 300585.739
autoRecoveryEnabled : false
autoRecoveryTotal : 0
channels : 1
clusterName : redshift
currentEventId : NONE
currentTimeMillis : 1433878195573
dataServerHost :
extensions :
host :
latestEpochNumber : 0
masterConnectUri : thl://
masterListenUri : null
maximumStoredSeqNo : 424
minimumStoredSeqNo : 0
offlineRequests : NONE
pendingError : NONE
pendingErrorCode : NONE
pendingErrorEventId : NONE
pendingErrorSeqno : -1
pendingExceptionMessage: NONE
pipelineSource : thl://
relativeLatency : 304511.573
resourcePrecedence : 99
rmiPort : 10000
role : slave
seqnoType : java.lang.Long
serviceName : redshift
serviceType : local
simpleServiceName : redshift
siteName : default
sourceId :
state : ONLINE
timeInStateSeconds : 351940.007
timezone : GMT
transitioningTo :
uptimeSeconds : 600921.759
useSSLConnection : false
version : Tungsten Replicator 4.0.0 build 18
Finished status command...

Lastly, let’s see what’s in the table. Fire up the psql shell and take a look:

dwh=# select * from test.sean;
                        c1                         | id | c3 
---------------------------------------------------+----+----
 working......                                     | 25 |
 hello sean i have an exclamation !!               | 27 |
 hello will i break?                               | 30 |
 some more records                                 | 32 |
 tungsten new csv file?                            | 33 |
 another tungsten csv file?                        | 34 |
 changed message text for redshift+tungsten update | 31 |
(7 rows)

Also: Was Fred Wilson wrong about Apple?

Get more. Grab our exclusive monthly Scalable Startups. We share tips and special content. Our latest Why I don’t work with recruiters

Wrestling with bears or how I tamed Tungsten replicator


I just dove into Tungsten replicator very recently as I need to replicate from Amazon RDS to Redshift. I’d heard a lot of great things about Tungsten, but had yet to really dig my heels in.

Join 28,000 others and follow Sean Hull on twitter @hullsean.

I fetched the binary and began to dig through the docs. Within a day I felt like I was sinking in quicksand. Why was this thing so darn complicated? To my mind, unix software is a config file simple.cfg, a logfile simple.log, and a daemon simpled. Open some ports & voila, you’re cooking.

Unfortunately for beginners, Tungsten takes a very different approach. Although it supports .ini files for configuration, the docs encourage huge commands: “configure”, which in Tungsten speak means generate a config file, and “install”, which literally means install the software for you into /opt/continuent.

After various posts to the forums, and a lot of head scratching, I discovered the cookbook. At first I thought this referenced puppet or chef type cookbooks, but it’s something different. It allowed me to set up a very basic Tungsten install, just to see things working.

Tungsten supports all sorts of “topologies”. For this example I am just doing master-slave. There are two nodes and each has mysql running on it. Node1 serves as the master, and runs the tungsten replicator (master node service). And node2 serves as the slave and runs the tungsten applier (slave node service).

Good luck, and I hope this helps others speed up the learning curve!

1. Download tarball (on master)

Note the forums indicated they may be moving off of google code. So this url may change.

$ cd /tmp
$ wget

Also: Why Airbnb didn’t have to fail

2. Expand the tarball (on master)

Use your vast unix skills to expand the tarball!

$ mv download.php\?file\=tungsten-replicator-oss-4.0.0-18.tar.gz tungsten.tgz
$ tar xvzf tungsten.tgz
$ mv tungsten-replicator-4.0.0-18 stage

Also: Is the difference between dev & ops a four-letter word?

3. Install MySQL (each box)

Hopefully you’ve done this before. Pretty straightforward:

$ apt-get install mysql-client mysql-server

Also: Are SQL Databases Dead?

4. Edit cookbook files (on master)

Inside /tmp/stage/cookbook you’ll need to make a few simple edits:


Edit NODE1 and NODE2 and comment out other lines.

export NODE1=ip-172-31-0-117
export NODE2=ip-172-31-1-188


export MASTERS=($NODE1)
export SLAVES=($NODE2)

export TUNGSTEN_BASE=/opt/continuent
export MY_CNF=/etc/mysql/my.cnf
export DATABASE_USER=sync

Also: 5 Things toxic to scalability

5. Create install directory (each box)

The /tmp/stage directory you created above is just a holding ground for the tarball. Continuent in its infinite wisdom wants to install itself. So, create a directory for it:

As root:

$ mkdir /opt/continuent
$ chown tungsten /opt/continuent

Then as the tungsten user, verify you can write there:

$ touch /opt/continuent/testfile.txt

Also: How to deploy on Amazon EC2 with Vagrant?

6. Configure aws security groups

AWS security group permissions are key to getting any of this tungsten stuff working. And there are a lot of moving parts here. Ping is required for various tests, as is MySQL’s port. But you’ll also need to enable Tungsten History Log port 2112 and the replicator ports.

o ping ICMP from your group
o enable inbound 3306 from your group
o enable THL – inbound 2112 from your group
o enable RMI – inbound 10000, 10001 from your group
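
You can sanity-check connectivity from one tungsten box to the other with bash’s /dev/tcp. A sketch — ip-172-31-1-188 is the slave node from this walkthrough; pass a different host as the first argument:

```shell
# Probe the MySQL, THL and RMI ports on the peer node.
HOST=${1:-ip-172-31-1-188}
for port in 3306 2112 10000 10001; do
  if timeout 3 bash -c "exec 3<>/dev/tcp/$HOST/$port" 2>/dev/null; then
    echo "port $port open"
  else
    echo "port $port closed"
  fi
done
```

Any “closed” lines point straight back at your security group rules.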

Also: Howto interview an AWS expert for managers, recruiters & engineers alike

7. Test the mysql client from the other box (each box)

$ mysql -h ip-172-31-1-188 -u root

$ mysql -h ip-172-31-0-117 -u root

If these hang, verify 3306 in your aws security groups. If you get a mysql error, be sure the “host” is appropriate. Check with:

mysql> select user, host, password from mysql.user where user = 'root';

For more info see Connect to MySQL in the Amazon public cloud.

Also: 5 Reasons to move to amazon redshift

8. Set up ssh auto-login (each box)

You need to be able to login to each box as a user “tungsten” without a password.

$ adduser tungsten
$ ssh-keygen
$ scp .ssh/ tungsten@ip-172-31-1-188:/home/tungsten/.ssh/authorized_keys

Be sure the authorized_keys file is 600:

$ chmod 600 .ssh/authorized_keys

Also: Did MySQL & Mongo have a beautiful baby called Aurora

9. Install ruby (each box)

You need to have ruby on both boxes.

$ apt-get install ruby

Also: Top interview questions for a MySQL expert – recruiters, managers & candidates

10. Create sync user (each box)

Both of your mysql instances will need a user that tungsten connects as. I called it “sync”.

mysql> create user 'sync'@'%' identified by 'secret';
mysql> grant all privileges on *.* to 'sync'@'%';
mysql> flush privileges;

Note, you may want to use a blank password at this stage, to eliminate that as a potential problem to debug. Also consider the security implications of ‘%’ and consider a subnet wildcard such as ‘10.0.%’.

Also: Myth of five nines why high availability is overrated

11. Enable binary log (each box)

Enable the mysql binary log with log_bin parameter. Also ensure that /var/lib/mysql is readable by the tungsten user, either by changing /var/lib to 655 or adding tungsten to the mysql group. Test this as well using less or cat.

Also: What is high availability and why is it important?

12. Update MySQL startup settings (each box)

Fire up your favorite editor and update /etc/mysql/my.cnf settings (both servers). The following are the main ones:

server_id = 1            # use server_id = 2 on the slave
sync_binlog = 1
max_allowed_packet = 52M
open_files_limit = 65535
# bind_address = localhost   (comment this out so mysql listens on all interfaces)

Note that changing innodb_log_file_size is tricky. You’ll need to stop mysql, rename old ib_logfile0 to ib_logfile0.old and ib_logfile1 to ib_logfile1.old. Then change the param in my.cnf and start mysql. Otherwise you’ll get errors on startup.

Also: Is the difference between dev & ops a four-letter word?

13. Run the installer (on master)

This step is fairly straightforward. If there are problems in this step, you’ll see ERROR lines in the output. Sift through them and resolve one by one.

$ ./cookbook/install_master_slave

Also: RDS or MySQL – Ten use cases

14. Check tungsten status (each box)

The “trepctl” utility allows you to check the current status. It does a lot more, but for now that’s enough. If you want to make it easier add “/opt/continuent/tungsten/tungsten-replicator/bin” to your path.

Also notice the “state” line. It should be ONLINE. If it says “GOING-ONLINE:SYNCHRONIZING”, that likely means you didn’t open up the Tungsten ports. You’ll need both RMI ports 10000 and 10001 as well as THL port 2112. We all know how finicky AWS security groups can be; I’ll leave it as an exercise to confirm those are open.


root@ip-172-31-0-117:/opt/continuent/tungsten/tungsten-replicator/bin# ./trepctl status
Processing status command...
---- -----
appliedLastEventId : mysqld-bin.000005:0000000000000321;235
appliedLastSeqno : 5
appliedLatency : 34824.086
autoRecoveryEnabled : false
autoRecoveryTotal : 0
channels : 1
clusterName : cookbook
currentEventId : mysqld-bin.000005:0000000000000321
currentTimeMillis : 1432840323683
dataServerHost : ip-172-31-0-117
extensions :
host : ip-172-31-0-117
latestEpochNumber : 0
masterConnectUri : thl://localhost:/
masterListenUri : thl://ip-172-31-0-117:2112/
maximumStoredSeqNo : 5
minimumStoredSeqNo : 0
offlineRequests : NONE
pendingError : NONE
pendingErrorCode : NONE
pendingErrorEventId : NONE
pendingErrorSeqno : -1
pendingExceptionMessage: NONE
pipelineSource : /var/lib/mysql
relativeLatency : 44450.683
resourcePrecedence : 99
rmiPort : 10000
role : master
seqnoType : java.lang.Long
serviceName : cookbook
serviceType : local
simpleServiceName : cookbook
siteName : default
sourceId : ip-172-31-0-117
state : ONLINE
timeInStateSeconds : 82860.917
timezone : GMT
transitioningTo :
uptimeSeconds : 82861.492
useSSLConnection : false
version : Tungsten Replicator 4.0.0 build 18
Finished status command...


tungsten@ip-172-31-1-188:/opt/continuent/tungsten/tungsten-replicator/bin$ ./trepctl status
Processing status command...
---- -----
appliedLastEventId : mysqld-bin.000005:0000000000000321;235
appliedLastSeqno : 5
appliedLatency : 42569.62
autoRecoveryEnabled : false
autoRecoveryTotal : 0
channels : 1
clusterName : cookbook
currentEventId : NONE
currentTimeMillis : 1432840348936
dataServerHost : ip-172-31-1-188
extensions :
host : ip-172-31-1-188
latestEpochNumber : 0
masterConnectUri : thl://ip-172-31-0-117:2112/
masterListenUri : thl://ip-172-31-1-188:2112/
maximumStoredSeqNo : 5
minimumStoredSeqNo : 0
offlineRequests : NONE
pendingError : NONE
pendingErrorCode : NONE
pendingErrorEventId : NONE
pendingErrorSeqno : -1
pendingExceptionMessage: NONE
pipelineSource : thl://ip-172-31-0-117:2112/
relativeLatency : 44475.936
resourcePrecedence : 99
rmiPort : 10000
role : slave
seqnoType : java.lang.Long
serviceName : cookbook
serviceType : local
simpleServiceName : cookbook
siteName : default
sourceId : ip-172-31-1-188
state : ONLINE
timeInStateSeconds : 1906.6
timezone : GMT
transitioningTo :
uptimeSeconds : 82882.334
useSSLConnection : false
version : Tungsten Replicator 4.0.0 build 18
Finished status command...

Also: Is zero downtime even possible with Amazon RDS?

15. Perform simple test

Create a table on the master mysql node.

mysql master> create database test;
mysql master> create table test.sean (c1 varchar(64));
mysql master> insert into test.sean values ('hi there');
mysql master> insert into test.sean values ('this should show up in tungsten thl file');
mysql master> insert into test.sean values ('new data on tungsten02 THL??');

Verify that the table is on the slave mysql node.

mysql slave> show databases;
mysql slave> select * from test.sean;
+------------------------------------------+
| c1                                       |
+------------------------------------------+
| hi there                                 |
| this should show up in tungsten thl file |
| new data on tungsten02 THL??             |
+------------------------------------------+
3 rows in set (0.00 sec)


Also: Why are MySQL & other database experts so hard to find?
