13th February 2017

Some of the most capable engineers I know are scared of AWS systems like Lambda and Kinesis. They are ‘magical’, use GUIs and are difficult to debug. I hope that this post-mortem, which runs through the challenges, can help people feel more comfortable using these tools when needed.

Isn’t releasing an album a solved problem?

It’s 2016. You may think music labels have a way to deliver you digital music. But, outside of iTunes, Spotify and Amazon, you can’t really buy an album by a major artist and get a bunch of audio files. What if an artist wants another option?

A few weeks ago, I was tasked with building an integrations system that could support delivering an email with a link to a zip file to customers who buy an album online. Scale was a big concern, but a 5 minute wait for the email was not an issue. That said, there was a complex set of requirements:

  • Each purchase (“Order”) should translate into exactly one email for the user. (Note: The email system can dedupe these using an ID)
  • Music is delivered via a link to a zip file. The zip file is generated in an external system.
  • The system will only receive ‘valid’ orders that will not error in the zip file or email system due to invalid data.
  • During times of high traffic, the system should not fall down but slowing down is OK.
  • The system should not require ongoing maintenance.
  • The system should use Node/JavaScript since the team has no familiarity with Ruby or Python.
  • The system must be built in under 5 working days. (LOL music industry)

Principles of an event-driven AWS integrations system

Using AWS is truly another universe. Systems like Lambda, SNS and Kinesis are similar to systems you may have built or encountered as an engineer. Sadly, experience with similar systems is not enough. Being an ‘ecosystem’, AWS requires a lot of specific ecosystem knowledge. An example of ecosystem knowledge that has no real basis in logic is that SNS and Kinesis can trigger Lambda workers but SQS cannot.

I can’t list all that specific knowledge here, but I’ve found that keeping a few high-level principles in mind helps when fumbling around with the myriad AWS services.

1. Events should be replay-able

Replayability is key for a new system that could encounter unanticipated errors in production. Early on we knew this would be a requirement, so we wrote all code with replayability in mind.

For our album delivery system, ‘replayability’ means that we can execute a ‘cleanup’ script to re-run a selection of “Orders” through the system. The actual code for each worker is designed to handle a ‘rerun’ event. As the system runs, it updates a record in DynamoDB. That allows us to replay an order that has a Zip URL but errored sending the email: it won’t generate a new Zip URL, but will try to send the email again.
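Here is a minimal sketch of what a replay-aware worker step could look like. The record shape (`zipUrl`, `emailSent`) and the `generateZipUrl` / `sendEmail` helpers are hypothetical stand-ins, not the project's real code, and everything is synchronous to keep the example short.

```javascript
// Hypothetical order record: { id, zipUrl, emailSent }.
// It stands in for the DynamoDB record the worker updates as it goes.
function processOrder(order, services) {
  // Skip zip generation if an earlier run already produced a URL.
  if (!order.zipUrl) {
    order.zipUrl = services.generateZipUrl(order.id);
  }
  // Skip the email if an earlier run already sent it.
  if (!order.emailSent) {
    services.sendEmail(order.id, order.zipUrl);
    order.emailSent = true;
  }
  return order;
}

// Replay an order that has a Zip URL but failed on the email step.
const calls = [];
const services = {
  generateZipUrl: (id) => {
    calls.push("zip");
    return "https://example.invalid/" + id + ".zip";
  },
  sendEmail: (id, url) => {
    calls.push("email");
  },
};

const replayed = processOrder(
  { id: "order-1", zipUrl: "https://example.invalid/order-1.zip", emailSent: false },
  services
);
console.log(calls, replayed.emailSent);
```

Because each step checks the record first, running the same order through twice is harmless, which is what makes a blunt ‘cleanup’ script safe to use.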

2. Be a team player

With the power of an infinitely scalable AWS system in your hands, it is easy to forget about your less powerful friends. Part of the job of a good integrations system is to make sure all systems stay up (and if they go down, that you can recover). If your system puts unanticipated traffic on another system and that system goes down, you fail.

For this system, we rely on two external APIs with unknown traffic constraints. We need to understand those constraints and build a system that can throttle to ensure they stay up.

3. Log all the things

Integration systems are very, VERY hard to understand. If you aren’t getting the result you expect, did you misconfigure something in the UI or did a worker error? If something is slow, what part of the system is slow?

I highly recommend spending some time creating a CloudWatch dashboard for your project. Below is a screenshot of the one for this project. Note that each column corresponds to an individual service. The columns are ordered to match how the data moves through the system (Ingester -> Kinesis…)

v0.1: Lambda + SNS

The first version of the system was based on one I had worked on at Citi Bike. We built a high throughput system for processing events from the Citi Bike system. The AWS setup went something like this:

[Queue] > [ElasticBeanstalk app inside VPC] > SNS > Lambda [fork] Multiple SNS > Multiple Lambda (storage, tracking, sending to other systems)

This setup works great if you are going into Amazon infrastructure, but not out. If one system exceeds its provisioned load, the others will either throttle (DynamoDB) or scale up (Lambda, SNS) to meet demand.

For this music project, we did not need to be in a VPC, were not handling as many events and did not need to do as many things with each event. We could be a bit more lightweight:

[Database of orders] > [Lambda fetching at 5 min interval] > SNS > Lambda that fetches Zip URL + sends email.

SNS ensures ‘at least once’ delivery, but we dedupe in our external systems (noted above). The Lambda workers can error gracefully when encountering duplicate events from SNS. No need to double check since the external systems provide those guarantees.

Sounds like this will work! But let’s try to break it.

First, let’s check the exact throughput of all our systems. For inbound, the client believed that they could get 1,000 orders a minute or more during initial launch. Earlier, we had been informed that scale was not a problem for the email and Zip URL systems. But we decided to contact them directly to get exact numbers. The email system was fine, but we were informed that the Zip URL system could handle a maximum of 10 concurrent requests.

Generating each Zip URL takes 3 consecutive API calls. In our initial tests, those consecutive calls take about 2 seconds total. That gives us a safe maximum of 5 concurrent Lambda workers, half of the Zip URL system’s limit of 10 concurrent requests. Given that 2 second execution time, 1,000 orders a minute would translate into ~33 concurrent workers. UH OH!
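The arithmetic behind those numbers can be sketched in a few lines. The inputs come from the text above; treating half of the Zip URL system's limit as headroom is my assumption about how to get to a "safe" 5.

```javascript
// Numbers taken from the text above.
const ordersPerMinute = 1000;
const secondsPerOrder = 2; // 3 consecutive API calls, about 2 s total
const zipSystemMaxConcurrent = 10;

// Average concurrency = arrival rate x time each order spends in a worker.
const arrivalsPerSecond = ordersPerMinute / 60;
const neededWorkers = arrivalsPerSecond * secondsPerOrder;

// Assumption: keep half of the zip system's stated limit as headroom.
const safeWorkers = Math.floor(zipSystemMaxConcurrent / 2);

// Throughput the throttled system can sustain with that many workers.
const sustainableOrdersPerMinute = (safeWorkers / secondsPerOrder) * 60;

console.log(Math.round(neededWorkers)); // workers needed to keep up with launch traffic
console.log(safeWorkers, sustainableOrdersPerMinute);
```

With 5 workers the system can only sustain about 150 orders a minute, so anything above that has to wait in a queue rather than hit the Zip URL system directly.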

SNS messages cannot be throttled and missed messages are dropped. So how can we throttle this system so that the Zip URL system does not go down?

v0.2: Lambda + Kinesis

Throttling SNS is, sadly, impossible. We could throttle the number of notifications we send in the first place, but then we would be building a queue. We need to look at other solutions and potentially rebuild the system.

My initial plan was to create a system like Resque (where we could have a queue with a fixed number of workers) but using Amazon SQS. While investigating SQS, I stumbled on a Stack Overflow post by Eric Hammond. Since SQS cannot drive Lambda functions, he suggested using Kinesis to get the same result.

Unlike SQS, Kinesis streams can drive AWS Lambda functions. Kinesis streams let you control how many AWS Lambda functions run in parallel (we used one worker per partition). So, we created a Kinesis stream with 5 partitions (Kinesis calls them shards) and that gave us a maximum of 5 concurrent Lambda workers. But there were some other issues we needed to deal with.

For our purposes, the key difference between Kinesis-to-Lambda and SNS-to-Lambda is how the system handles errors.

  • SNS sends a notification to Lambda and does not retry if the lambda worker fails
  • Kinesis will retry failures as long as the data is still in Kinesis
  • Kinesis will throttle if there are lots of failures.

Our solution to those problems was to simply make sure that the lambda workers never errored, but instead saved everything to DynamoDb along the way. This way, we could have a separate worker that would find all ‘errored’ orders and retry them in a separate process.
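A rough sketch of the ‘never error’ worker follows. The `deliverOrder` and `saveStatus` functions are hypothetical stand-ins for the zip + email steps and the DynamoDB write; they are injected so the example can run locally without AWS.

```javascript
// deliverOrder and saveStatus are hypothetical stand-ins, injected for testing.
function makeHandler(deliverOrder, saveStatus) {
  return function handler(event, context, callback) {
    for (const record of event.Records) {
      let order = null;
      try {
        // Kinesis delivers each payload base64-encoded.
        const json = Buffer.from(record.kinesis.data, "base64").toString("utf8");
        order = JSON.parse(json);
        deliverOrder(order);
        saveStatus(order.id, "delivered");
      } catch (err) {
        // Record the failure instead of letting it escape to Kinesis.
        saveStatus(order ? order.id : "unparseable", "errored");
      }
    }
    // Report success so Kinesis does not retry the batch.
    callback(null, "ok");
  };
}

// Local run with one good order and one that fails.
const statuses = {};
const handler = makeHandler(
  (order) => {
    if (order.country === "blocked") throw new Error("no rights");
  },
  (id, status) => {
    statuses[id] = status;
  }
);
const encode = (obj) => Buffer.from(JSON.stringify(obj)).toString("base64");
let result = null;
handler(
  {
    Records: [
      { kinesis: { data: encode({ id: "a", country: "ok" }) } },
      { kinesis: { data: encode({ id: "b", country: "blocked" }) } },
    ],
  },
  {},
  (err, value) => {
    result = err ? "failed" : value;
  }
);
console.log(statuses, result);
```

The separate retry worker then only needs to query for records marked ‘errored’. Note that in this sketch a failure inside `saveStatus` itself would still escape.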

Launch Problems

Two interesting errors collaborated to make for a rocky launch. We received valid orders from countries where, due to music rights issues, we could not actually send a Zip URL. Earlier in the process, we had encountered and handled errors in the Zip URL system. The system would catch the error, flag the order as ‘errored’ and move on to the next order. However, the Node library we used for interacting with the Zip URL generation service did not call the error callback. Instead, it threw a parse error, which caused the Lambda worker to error, which in turn caused Kinesis to retry the order indefinitely. Since the order continued to error, the retries caused throttling across our whole system.

We were able to see these events in our metrics and had clear logs from our Lambda workers. The fix was quite simple once we directed our attention to solving it. We deployed the fix and then re-ran previous orders to ensure nothing was missed.
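To illustrate that failure mode, here is a sketch of one way to guard a callback-style library call. It assumes the library throws synchronously; a throw from inside an asynchronous callback would not be caught this way. `requestZipUrl` is a made-up stand-in for the real library call, and this is not necessarily the fix we deployed.

```javascript
// Wrap a callback-style call so a synchronous throw is reported through the
// same callback as a normal error, and the callback fires at most once.
function safeCall(fn, arg, callback) {
  let finished = false;
  const once = (err, value) => {
    if (finished) return;
    finished = true;
    callback(err, value);
  };
  try {
    fn(arg, once);
  } catch (err) {
    once(err);
  }
}

// Hypothetical stand-in that throws instead of calling back with an error.
function requestZipUrl(order, cb) {
  if (order.country === "blocked") {
    throw new SyntaxError("Unexpected token < in JSON");
  }
  cb(null, "https://example.invalid/" + order.id + ".zip");
}

const outcomes = [];
safeCall(requestZipUrl, { id: "a", country: "ok" }, (err, url) => {
  outcomes.push(err ? "errored" : "ok");
});
safeCall(requestZipUrl, { id: "b", country: "blocked" }, (err, url) => {
  outcomes.push(err ? "errored" : "ok");
});
console.log(outcomes);
```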

Ideas for v0.3

One of the most exciting technologies out right now is definitely API Gateway. That said, it isn’t something I felt I could learn and use within the time frame of this project. However, it does meet many of the requirements for the project, such as replayability and throttling, so it might be an interesting exploration.

13th February 2017

…without going completely crazy in the process.

This post outlines a responsible way for developers to get around the Shopify 100 variant limit, and to understand the tradeoffs enough to sell the feature to clients.

As many, many people have noticed, you can only have 100 variants for any given product on Shopify. For probably 99.999% of shops, this is totally fine. While I don’t fully understand the rationale for the limit, it exists and Shopify does a good job communicating it.

So what can we do if a client wants more than 100 variants for a product, but doesn’t have the money for a large Magento / Spree / Solidus build?

Enter us, the few, the proud, the 100 variant limit breakers.

The 100+ variant client

We have a client that wants to create a nice experience for buying custom sofas and chairs. There are three basic ways to customize the product: Fabric (8), Wood finish (4), and Size (5). In total, that gives us 160 variants per sofa or chair. Sadly, that puts us above the 100 variant limit. But, we may be able to get that number down.

The hope: Line item properties

Shopify recommends using ‘line item properties’ to get around the 100 variant limit. Line item properties are a way for you to add { key: value } pairs to a line item. But, line item properties cannot change the price. Also, there is no UI in Shopify admin for managing images that correspond to a line item property — like you might for a variant. This would be useful if the ‘Color’ property changes the image but not the price.

To help us see if we can use line item properties, I made a little chart for our client. Can we decrease the number of variants by using line item properties?

                  Fabric    |    Finish    |    Size  
Changes image:     yes            yes            no  
Changes price:     yes            no             yes

Sadly, each of our properties needs a variant.
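The chart boils down to a small check. In this sketch, the rule that an option can only become a line item property if it changes neither price nor image is my reading of the constraints above (the missing image UI is treated as a blocker), and the option counts come from the client brief.

```javascript
const VARIANT_LIMIT = 100;

// Values from the chart and the client brief above.
const options = [
  { name: "Fabric", values: 8, changesImage: true, changesPrice: true },
  { name: "Finish", values: 4, changesImage: true, changesPrice: false },
  { name: "Size", values: 5, changesImage: false, changesPrice: true },
];

// An option has to stay a variant if it changes the price or the image.
const needsVariant = options.filter((o) => o.changesImage || o.changesPrice);

// Variants multiply: one for every combination of the remaining options.
const variantCount = needsVariant.reduce((total, o) => total * o.values, 1);

console.log(needsVariant.map((o) => o.name));
console.log(variantCount, variantCount > VARIANT_LIMIT);
```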

Interestingly, we accepted this project because we misunderstood line item properties — thinking they could change the price. We basically went ‘OH SHIT’ and came up with our solution.

The Million Variant Solution

The solution is to break the product up into logical ‘sub-products’ that are less than 100 variants each and then group the ‘sub-products’ into collections.

For us, that meant breaking the product up by fabric material. We had 2 different materials, so that gives us ~80 variants per product. Below is how that might look in the Shopify Admin/products page for 2 sofas and a chair:

Individual products (‘sub-products’)

The most important part of setting up each ‘sub-product’ is that the options for each variant need to be in EXACTLY the same order across every product. In our case, the order is: Size, Color, Finish.

If the values of a property are the same across ‘sub-products’, we need to make sure the values are EXACTLY the same. This allows us to de-dupe when we display them. For us, Size is always 80, 85, 90, 95, 100. The downside is that adding or changing a size now requires changing it across all ‘sub-products’.

Finally, we tag the products (“The Ludlow”) so that we can easily group them later.

Product collections (products)

Using the tag we added above (“The Ludlow”), we create a new collection of products where the tag is equal to “The Ludlow”. For our site, this product collection will be our actual product. We can now create many different products that exceed the variant limit and easily display them on the site under this collection. Here we can control the public title, description and image for the product. People will never see pages for the individual ‘products’ — only this collection.

Displaying products with 100+ variants

Now that we have the back end set up, we need to display these ‘collection products’.

‘Collection products’ are our own invention and are really just regular collections that render using the template collection.liquid. We will be making some significant changes to the collections template but we want to continue to render other collections normally. Shopify has a nice way to change the template a collection uses under ‘Theme template’ on the collection page. Here we create a new template called collection.{type}.liquid and then change each ‘collection product’ to use this new template.

In this new template, we render all variants of all products in the ‘collection product’ in JavaScript. Note that options are accessed by position as option1 (size), option2 (color), option3 (finish). Sadly this is the only way to access options and is why keeping the option order consistent is so important.

 // One entry per variant, across every 'sub-product' in the collection.
 // Liquid numbers variant options from 1: option1, option2, option3.
 var allVariants = [
   {% for product in collection.products %}
     {% for variant in product.variants %}
       {
         id: "{{ variant.id }}",
         sku: "{{ variant.sku }}",
         size: "{{ variant.option1 }}",
         color: "{{ variant.option2 }}",
         finish: "{{ variant.option3 }}",
         price: "{{ variant.price | money }}",
         type: "{{ product.type }}",
         image: "{{ variant.image.src }}"
       },
     {% endfor %}
   {% endfor %}
 ];

For us, our UI was a bit more complex than merely select boxes so we could not use Shopify’s option_selection.js. I wrote a small Backbone app to handle changes. You can check that Backbone app out here. If you are familiar with Backbone, you should be able to easily modify it to suit your app.
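If you don’t want Backbone, the core of what the UI needs from the allVariants array above is small. This is a plain JavaScript sketch, not the Backbone app itself; the sample data and the `optionValues` / `findVariant` helper names are made up for illustration.

```javascript
// Sample data in the shape of the allVariants array above (values are made up).
const allVariants = [
  { id: "1", size: "80", color: "Linen Grey", finish: "Oak", price: "$900" },
  { id: "2", size: "85", color: "Linen Grey", finish: "Oak", price: "$950" },
  { id: "3", size: "80", color: "Velvet Blue", finish: "Oak", price: "$1,100" },
  { id: "4", size: "80", color: "Velvet Blue", finish: "Walnut", price: "$1,100" },
];

// Distinct values for one option, in first-seen order. Only exactly equal
// strings are deduped, which is why values must match across 'sub-products'.
function optionValues(variants, key) {
  return Array.from(new Set(variants.map((v) => v[key])));
}

// The variant matching every selected option, or null if none exists.
function findVariant(variants, selection) {
  const match = variants.find((v) =>
    Object.keys(selection).every((key) => v[key] === selection[key])
  );
  return match || null;
}

console.log(optionValues(allVariants, "color"));
console.log(findVariant(allVariants, { size: "80", color: "Velvet Blue", finish: "Walnut" }));
```

The id of the matched variant is what you would add to the cart. A `null` result means that combination doesn’t exist in any ‘sub-product’, so the UI should disable it.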

The Tradeoffs

Weirdly, the most difficult part here isn’t the actual solution, but dealing with the repercussions. The reality is that Shopify is not designed for these ‘collection products’. Some big tradeoffs are:

Sorting

Creating a page for a ‘collection product’ with 100+ variants is only part of the problem. The next part of the problem is when you want to display these ‘collection products’ across the site in sortable groups. For example, we want to display products on the homepage in a certain order. Since our ‘products’ are really collections, this is very difficult. We need to hardcode this in the template. Adding or removing a sofa or chair now requires an engineer to update that custom code on the homepage. You can get around this with other hacky solutions using metafields, but metafields only work on products, not collections.

Brittleness

Additionally, option order and property names need to be consistent across each product in a product collection. This makes the site more brittle and difficult to edit for the client.

I hope this gives some examples of the tradeoffs involved in supporting products with 100+ variants. In short, it definitely is possible, and not really that difficult; it just complicates things a bit more.

13th February 2017

Live coding interviews test culture fit under the guise of testing competence.

I’ve always had a bit of a tenuous relationship with live coding interviews. I love solving challenging problems, I love talking to people about software problems and I love writing code. So, why do live coding interviews feel so gross to me?

Well, curiosity got the best of me recently and I tried a few. While I can’t speak for other engineers, I can confidently say that live coding interviews are not a good way to vet my technical abilities or to understand the value I can deliver as a member of an engineering team.

I’m not a CS grad. But, I have written code professionally for the past 7 years and held a director level position at two different companies. While I’ve taken CS classes at MIT and worked as a researcher there for a few years, I didn’t get the interview training a CS grad might. Without that practice and training, I will fail a basic coding interview 100% of the time.

As many people have talked about, there is a skill to coding in front of people, a skill that does not necessarily correlate with skill at building maintainable systems. But, a coding interview is a structured test. Given the right role, could I get good at passing that test so that I nail an interview? Certainly. But, as a professional engineer, when will I have time to ‘git gud’ at technical interviews? Even if I had time, no one has ever warned me that they are going to give me a coding interview. It’s a surprise!

“A wild coding interview appears!”

Live coding isn’t something I expect to do when an engineer asks to “chat”.

For my last “chat”, I met a founder and they wanted me to meet the lead engineer. Before the call, the engineer sent me a Google Doc with the text “in case we want to jot down ideas while we chat”. Turns out that was where we were going to do a coding interview. Wild!

That was an odd case, but usually I get invited to come by an office and “chat with some people on the team”. Maybe 70% of the time, I speak with various people on the team about problems and potential solutions. 30% of the time, a wild coding interview appears! “Live coding interview” casts some spell where I get put in a tiny conference room while a guy (always a guy) takes notes as I write code with a marker. I lose the battle.

Why not let someone prepare for a live coding interview? On the surface, I understand that you don’t want someone to prepare for a coding interview since they could look up common interview questions. But, by not telling people, you run the risk that some people are going to be more prepared than others, purely by chance. Coding interviews are a part of some engineering cultures and not others. That doesn’t make one more qualified than the other.

A live coding interview is a test that some candidates have practiced recently while other candidates have not practiced in YEARS. By surprising them, you favor people who have deeply practiced coding interviews in a CS undergrad program, had professional coding interview training at a bootcamp, or interviewed at a bazillion companies (maybe they can’t hold a job?). If I had a friend at the company, they could give me a heads up, and then I could prepare. While a ‘surprise live coding interview’ may make sense to some, it does favor CS grads, people “in network” and people who can’t hold a job. Wild!

Live coding interviews are just as much about culture fit as the rest of the interview

As someone unfamiliar with live coding interviews, it is a space where I know I am being tested, but I don’t know what I am being graded on. Assumptions are cultural norms.

Without the grading mechanism stated and agreed upon by all parties, the interviewer can move the goal post. Or, the interviewee may optimize for something the interviewer doesn’t care about at all.

When you say ‘live coding interview’, I hear ‘we only hire CS grads’.

Here are some of my notes after my last coding interview — they have nothing to do with improving my engineering skills, and are all about improving how I perform in a live coding interview.

  • First, establish what the interviewer is looking for in the coding interview
    Usually when you ask directly what the interviewer cares about, they will deflect with ‘we just want to see how you think through problems’. Follow up with “how many questions will we go through in this process?” and “how much time do we have?”. The goal of these questions is to understand if the interviewer is evaluating problem solving speed, performance of the solution (n log n, n² etc), ability to communicate as you code (a new skill!) or familiarity with the language you chose.
  • Ask to code in pseudocode
    In my last live coding interview, I was asked to write in a real language but could not use my text editor. Then, I was asked to not worry about making the code syntactically correct. That experience was very jarring and tripped me up a lot. When asked to choose a language, I should respond by asking if they want to test my understanding of the language or if they are just trying to make me feel comfortable.
  • After writing the code, try to break the code in a way visible to the interviewer
    One of the challenges of live coding interviews is jumping between writing code and talking about code with the interviewer. For engineers who are not formally taught, this practice breaks their learned coding process, causing them to skip steps in live coding that they would not skip in a real coding environment. One of the steps I forgot was to try to break the code I wrote.

In conclusion

I’m far from a perfect engineer. I’m excellent at some things and absolutely terrible at others. Want to know what those things are? Talk to people I’ve worked with (or ask me!) but don’t pretend you can read between lines of code on a whiteboard.

Much like the SAT when applying for college, live coding is a structured test. I didn’t go to a school that trained me to do live coding, and so will probably fail the test. As I’ve experienced it, live coding isn’t the meritocratic space that it pretends to be. Live coding interviews select for the people who are good at live coding interviews. Those people tend to be CS grads. If that is what you are looking for, add that to the job requirements and save everyone time. Please don’t pretend live coding is a good way to vet candidates on their technical chops.

Alternatives to live coding interviews

I’ve compiled a short list of some alternatives for testing technical competence. They aim to be inclusive of non-CS grads while being easy for an engineering team to implement fairly:

  • Review a project the candidate built
  • Talk through designing a system for doing ‘x’
  • Review a real pull request
  • Take home project (I probably won’t have time for it but could talk through it!)
  • Talk to people the candidate has worked with

In short, no more live coding interviews for me. For some in-depth looks at alternatives, I highly recommend reading this post by Buffer about their engineering hiring process and this one about “Programmer Moneyball”. I hope that you find the one I wrote about my experience hiring engineers at Artsy helpful too.

13th February 2017

I live in New York City or "the broken website everyone uses" as @xuhulk coined. Some days it is a truly marvelous place and other days, a marvel that anyone can even live here.

I’ve looked for a general list of New York’s infrastructure problems but I have not found even a single one. While I could certainly go on about the things that are great about New York, I want to be able to be honest about the city’s shortcomings. A list of infrastructure issues would set up a focused prompt for conversations about where to start improving NYC. We may not be able to agree about what to do about any of these issues, but hopefully we can agree on the current reality.

Here I’ve loosely documented some of the major pieces of physical infrastructure that are broken in New York City. The list does not include national problems that only incidentally apply to New York and focuses on issues that are best represented by our most populous and dense city. For the purposes of this piece, I’m defining infrastructure as "the basic physical structures needed to operate a city". Perhaps in a later list I can investigate the vast set of non-physical infrastructure systems like inequality, homelessness and schools.

I don’t want this list to be something needlessly negative but a list to help us New Yorkers focus. If we want to talk about making New York better, where should we start? Wikipedia does a very good job putting a positive spin on New York City’s environmental issues here. While broader than infrastructure and overly positive, it is worth a read. I learned that NYC is responsible for 1% of the US’s greenhouse gas emissions while making up 2.7% of the total population, which makes NYC significantly more environmentally friendly than cities such as Portland and SF.

Much like NYC, this document is intended to be a constant work in progress rather than some authoritative, unchanging thing. I am no expert on this, only an interested observer. If you have any recommendations please contact me or collaborate on the post on GitHub.


1. Trains

If you’ve visited absolutely any major city outside of the US, coming back to the trains in New York is always a bit of a drag. Our unreliable and messy train system, while once a point of pride, is a bit of a sore spot for the city.

Timeliness: Trains run on time 74% of the time in NYC. [source]

Expansion: Until late 2015, the train routes had not expanded at all since WWII

  • The only recent expansion, the 2nd Ave subway, is taking a while

  • Plans for the 2nd Ave subway started in 1919 and initial funding was approved as part of a larger plan in 1929

  • Due to the Great Depression, WWII and funding challenges, digging on the 2nd Ave subway did not begin until 1972. Digging is expected to finish in 2016 and the stations are expected to fully open in 2029

Cost: The subway is 5–7x more expensive than subways in comparable cities

  • The 2nd Ave subway costs about $2.3 billion per mile. Subway tunnels in other dense, first-world cities can vary widely in cost, but almost always are between $200 million and $1 billion per mile of route. [source]

  • The recently opened Fulton Center station is the 2nd most expensive train station in the world, second to the PATH station under the WTC. [source]

Old signaling (called ‘fixed-block signals’) makes it difficult to know where trains actually are or how fast they are traveling. [further reading]

  • These old signals are why few lines have estimated arrival times for trains — due to these signals, no one knows where the trains are or how fast they are going.

  • CBTC is a modern technology the MTA is slowly rolling out. CBTC allows a central operator to know more precisely where trains are on the tracks and thus run them more closely together. [source]

  • CBTC was installed on the L & 7 but took 18 years to install.

  • Highly recommend watching this video on how CBTC and fixed-block signals work

Other issues

  • Disaster preparedness is a real issue (see Sandy closings)

  • Little Wifi or even cell phone access on the subway

  • Heating/cooling in the stations

  • Track fires are an issue [source]

2. Trash

Unlike pretty much any other modern city in the world, New Yorkers are totally comfortable partying in trash. Walk around the Meatpacking district on a Saturday night (don’t do this) and next to all the Bentleys and Ferraris will be…huge piles of trash. While visibly inconveniencing the rich may be another point of pride in NYC, there are some more fundamental problems with trash that affect us all. [source]

  • Trash collection and processing costs $2.3 billion per year.

  • Roosevelt Island has a network of pneumatic tubes to aid in trash removal [source]

  • The 10 year #OneNYC plan seeks to address the trash problem but doesn’t provide much clarity around specific changes that will happen over the next 10 years.

  • The city only recycles 15 percent of residential waste [source]

  • Since closing the Fresh Kills landfill (due to local and EPA pressure) in 2001, there hasn’t been a local-ish way to dispose of ~50,000 tons of residential and commercial waste every single day.

  • Currently, 76 percent of the city’s residential trash is sent to landfills (in states like Pennsylvania, Ohio, Virginia and South Carolina), 14 percent is recycled, and 10 percent is converted to energy. [source]

3. Water

In spite of our ~100 year old water system that supplies 20 million people, we have some really fresh water. In NYC, our water comes from the Catskills and requires very little treatment. That said, we were teetering on potential water supply disaster for a while. Up until a few years ago, had a single 100 year old tunnel collapsed, we would have had 20 million people without water.

Supply: Until late 2013, NYC only had two water tunnels. If either tunnel failed, NYC would be out of water for 2–3 YEARS.

  • Construction on Tunnel #3 began in 1970 and it became operational in 2013 but is not complete. The final phases (3 and 4) of Tunnel #3 are expected to finish in 2020. 24 people have died building it.

  • Check out this write up about the story of building Tunnel #3

Tunnels #1 and #2 are ~100 years old and very leaky / broken but no one is sure how broken!

  • If we turn off Tunnel #1 or Tunnel #2 to repair, we don’t know if we will be able to turn them on again.

  • Since Tunnel #3 is operational, the city plans to inspect and repair Tunnels #1 and #2, which have not been inspected since construction.

4. Sewage

Approximately 70 percent of New York’s sewers are combined. This means that household and industrial wastewater, rainwater, and street runoff (1.3 billion gallons daily) are all collected in the same sewers and conveyed together to the City’s 14 treatment plants.

  • This means that during floods, water from all these sources flows through the streets.

  • During heavy rains or snow, combined sewers can fill to capacity and are then unable to carry household and storm sewage to treatment plants. The mix of excess stormwater and untreated sewage must be released directly into the city’s waterways. [source]

Superfund sites on the National Priorities List such as Newtown Creek and the Gowanus Canal.

  • Among other issues, the Gowanus Canal has gonorrhea [source]

  • Newtown Creek contains a baffling 30 million gallons of spilled oil [source]

Flood risk has risen 20x since late 1884s [source]

  • Curious about what factors lead to this. Global warming?

5. Communications

Maybe one day we will have FIOS. Until then, almost all New Yorkers are stuck with Time Warner, which has a (near?) monopoly on internet and cable delivery in the city. Many New Yorkers feel that due to this monopoly, Time Warner does not deliver quality service or competitive pricing (it has 1.5 stars on Yelp).

  • Comcast and Time Warner are under investigation for slow internet speed [source]

  • Verizon FIOS rollout has not gone well. The city is currently investigating Verizon for "An egregious failure on the part of Verizon to deliver on the FIOS agreement.". [source]

  • Cell infrastructure is largely pretty good for most carriers but I haven’t found much information on it aside from consumer reviews.

6. Energy

Many veteran New Yorkers will remember the blackout of 2003. After looking more into that, I learned that energy supply (both electric and gas) is very well developed and modern in NYC. But, the pipes and delivery mechanisms for ‘last mile’ service are very old and leaky. Very recently, these old leaky pipes caused an explosion in the East Village, injuring 19 people.

  • The city gets 98 percent of its electricity from natural gas. And natural gas meets 65 percent of heating needs. [source]

Marcellus natural gas pipeline

  • While the pipeline is a modern way to deliver natural gas, the gas itself is sourced through fracking, which causes a wide array of environmental problems. [source]

  • Check out this 3 part series on the environmental and economic issues around fracking in New York.

  • New York State has banned fracking, but a large portion of our natural gas (not sure how much) comes from Pennsylvania [source]

Old Pipes: There are over 6,300 miles of very old natural gas pipes under the city. They leak and break causing explosions [source]

  • "In 2012 alone, Con Edison and National Grid, the other distributor of natural gas in the city, reported 9,906 leaks in their combined systems, which serve the city and Westchester County. More than half of them were considered hazardous because of the dangers they posed to people or property, federal records show" [source]

  • Con Edison experienced 83 leaks for every 100 miles of main in 2012. [source]

7. Roads

As an avid biker, my experience with the roads is largely dodging potholes. My only accidents in 6 years of biking in the city have been with the road, not with drivers or other bikers. Over those 6 years, I’ve seen the road quality generally improve and the number of bike lanes (and bikers) expand very rapidly. While I vividly understand the plight of bikers in the city, I do not personally understand the experience of drivers in the city of New York. As a result, this information may be particularly biased and incomplete.

Robert Moses

  • He is credited with building every road in the city with ‘expressway’ and ‘parkway’ in its name — including the BQE, the Cross-Bronx expressway and the Triborough Bridge.

  • Highly recommend the Bowery Boys podcast and reading The Power Broker.

  • All of his roads and bridges were designed to NOT accommodate city buses. Robert Moses felt that the poor, the people who ride the bus, should not be on his roads or going to his parks. [source]

  • In addition to being classist, Robert Moses was also very racist. Moses wanted to "keep African Americans from using pools in white neighborhoods by making the water too cold." [source]

  • Since he had so much power, his classism and racism are built into the city's roads and lack of train infrastructure (he also stopped train development from 1930–1960).

Congestion is an issue in New York. Mayor Bloomberg tried to enact Congestion Pricing in the city but it failed when the economy collapsed in 2008. [source]

  • A lot of the traffic on Canal St. consists of trucks from Brooklyn, and Long Island on their way to New Jersey. Most could travel a more direct route over the Verrazano-Narrows Bridge into Staten Island, and then over the Goethals Bridge into New Jersey, but they do not. The reason? A one-way toll for westbound traffic across the Verrazano (the bridge is owned by New Jersey). Most simply skirt the fee to use the Manhattan Bridge-Canal Street-Holland Tunnel route for free. The result? Heavy pollution, and terrible congestion for downtown residents. [source]

Pedestrian safety and "The Rule of Two"

  • A driver has to be committing two misdemeanors at the same time when they hit a pedestrian for it to be the driver's fault [source]

  • The act of hitting the pedestrian or cyclist with right of way doesn’t count as one of the two violations. [source]

  • Unless those two violations are met, killing a pedestrian walking with the signal results in a maximum penalty of 30 days in jail and a $500 fine. [source]

  • Mayor de Blasio’s "Vision Zero" plan is showing mixed results [source]


There are certainly other infrastructure issues I haven’t listed here. I really struggled with where to draw the line on what constitutes ‘infrastructure’. When you think about it, virtually every social issue comes back to the infrastructure that supports or enables it.

Infrastructure issues could include the low number of public schools, lack of affordable housing units and limited space in homeless shelters. Inequality is an issue where NYC truly…shines. I struggled with whether and how to include inequality as a topic in the list. But I have not found a way to quantify inequality through its infrastructure-related symptoms such as unoccupied luxury housing units or availability of affordable housing.

I hope this list is helpful as a broad overview of some of the issues New York City faces. A thesis could be written on most of the individual bullet points in this list, let alone the larger themes. I’ve tried to link to insightful and readable articles to help the reader dig into issues they are interested in, but I know even that gives an extremely incomplete picture.

As I mentioned above, this is meant to be a collaborative work in progress. If you have any contributions or want to propose some changes to this list, please contact me or collaborate on the post on Github.

13th February 2017

This is the start of a series about the lessons I learned while directing the web engineering team at Artsy. Artsy's vision is that art can be as popular as music, and its mission is to bring all of the world's art online. It has now raised over 50MM and has made it to that in-between phase where it is 'not a startup not yet a company'.

Being a part of that change was an incredible opportunity. While I was at Artsy, the company matured and grew from 15 to about 90 people. Since leaving, it has further matured and the staff count is over 100. Each of these people probably has their own list of things Artsy does right or wrong, but this series is my list. The hope is that this helps me (and maybe others) build on Artsy's success in future roles. My knowledge, and these lessons, reflect my experience within the engineering team and not the company as a whole.

Hiring awesome people

Finding smart, motivated people who work well together is virtually impossible. If Artsy has a secret sauce, it is how it hires. All else follows from the assumption that they have hired the best people who want to work together to achieve Artsy’s mission.

Finding awesome people

The first step in hiring is defining the position. Artsy does this by crafting aspirational rather than clinical job descriptions. Within engineering, the job descriptions focus on challenges, ownership, opportunities for growth and quality of team. In keeping with this high-level approach, Artsy lists qualifications as fundamentals rather than 'x years doing y'.

Even with a great job posting, reaching engineers is challenging. At a recent engineering event, a CEO was walking around interrupting conversations to ask people if they were developers and, if so, pitching his company. In contrast, Artsy values the unique knowledge and contributions of individual engineers. It demonstrates these values through contributing to open source, blogging, speaking and targeted press. During this process, engineers begin to understand Artsy, and Artsy understands individual engineers. Rather than yelling 'developers! developers! developers!' in a crowded room, Artsy uses this knowledge to vet and make dedicated offers to individual engineers.

The takeaway is that the key to finding great people is making the hiring process about people. Artsy's process is in direct contrast with my current role where engineering applicants create an account on our hiring portal and fill out a 12 page form before we can contact them. This severely limits our applicant pool. Artsy keeps the barriers to 'first contact' low through online contact via OSS/blogging and by inviting candidates to a casual team event or coffee.

Evaluating awesome people

If you could choose between doing an interview and doing references, which would you pick?

Artsy is very much in the 'references' camp but does continue to do interviews -- it just does them non-traditionally. Starting with a rigorous process for finding highly targeted people to interview enables a different interview process. Instead of a rigorous evaluation, Artsy uses the interview to answer questions about the candidate's experience and then focuses on selling the candidate on the company.

This lack of focus on the interview stems from the belief that it is impossible to tell good from great in an interview for creative or technical roles. The interview is a highly artificial environment that is unrelated to how the individual creates value day to day. Beyond being artificial, one-on-one interviews are also unreliable because they play on our biases. We often "value intuition over historical data" and "value people who are similar to us" much more than we should. We are also influenced by quick and superficial qualities such as attractiveness, race or background which have nothing to do with the candidate's ability to perform. Some companies change their interview process to try to fix all of these issues. Artsy handles this by acknowledging these biases, communicating them and focusing on references.

Artsy's secret weapon in evaluating candidates is doing references in a thorough way. Artsy believes that 'references are not a defense against hiring poorly, they are a way to hire great people'. There are a variety of techniques for improving reference calls. In general, the goal is to see questions like "What are your general thoughts on x's performance?" and "How was working with x?" as warmup questions rather than the meat of the call. The call can then focus on more substantive questions where you ask the reference to compare 'x' against other people. For example, 'Are they the best person you've worked with in that role?', 'Why are they not the best?'. Ask about what gets the person excited and what makes them happy. Just like when seeking out individuals, reference calls are about understanding who they are as an individual and whether they will be able to perform at a high level in your organization.

Just like interviews, references have many biases and issues. Consider the strength of the reference and the general strength of employees from that company. Ideally someone experienced in the industry, who has a wide network tied into many similar companies, should do the reference call. After doing many interviews and references, you will understand which organizations have consistently awesome people and which don't. You will also build up a network of people you can ask for a second opinion on a candidate.

Closing on hiring awesome people

The closing process is just as high touch as the rest of the process has been. Whether you make an offer or reject the candidate, do so thoughtfully by incorporating feedback from references and interviews so the individual can improve. Deciding whether to make an offer is always difficult but is really asking 'Is this someone you would actively seek to work with?'

During the closing process, salary negotiations will come up. The negotiation is not an opportunity to try to get the most out of a person for the least amount of money. Adversarial salary negotiations are unfair and inconsistent with any organization that values actual human beings. As Artsy states, 'compensation is a function of the value of the candidate based on market rate, the value the company brings to the candidate and the market in between'. For figuring out market rate, dB., the CTO of Artsy, recommends responding to recruiter spam to ask about comp. They have also published the comp framework here.

When closing, consider other factors that may go into them accepting the offer. Don't hesitate to bring in employees, references or investors to support making the hire. If they accept, send something personalized and anything you think might generally interest the candidate errr… new hire - Congrats!

Takeaways

After moving on to a different company, I've discovered just how challenging it is to enact a similar hiring process. You can't pick and choose various convenient or 'easy' parts of Artsy's hiring process and just apply them. Each step builds on the previous one - such as with the closing process. During a thorough reference call, you often learn information that helps you make the hire when the time comes.

Another challenge with implementing this is making the time. It takes an extraordinary amount of effort to create a human-centered hiring process. Creating a little hiring portal and posting on LinkedIn is easy, but it doesn't work. At my current company, we mostly race to do things that needed doing yesterday. We don't yet have a product process where we plan features (say, in sprints) and execute them. Without proper planning, it is nearly impossible to set aside time to mine our network and to thoughtfully pursue and meet with the kind of people we want to hire.

I've also gained a greater appreciation for the network effects of Artsy's hiring process. By actually understanding and caring about people, the process generates a lot of goodwill even when it doesn't work out. That goodwill pays off in terms of valuable referrals. Without this network, contacting people is more challenging. In my new role, where we have yet to develop this network, we also have yet to open source anything or create any kind of blog. This limits our exposure to engineers. In addition to these challenges online, we work in deep Sunset Park, Brooklyn. We are over an hour from anyone working in Manhattan and far from community events. While this may sound like a ton of challenges, by using this model, we at least have identified a clear goal and the problems in our way.

In summary, if you want to hire awesome engineers, put in the work. Demonstrate that you care about and understand the value engineers provide.

13th February 2017

This is a quick writeup of how I was able to geocode apartment sales in New York and then associate each sale with a neighborhood. The raw sales data is on the Department of Finance website here and the neighborhood tabulation areas are here. There are some caveats with the data. For example, out of 487,874 residential sales only 223,529 were both individual units and had enough information to calculate a sane price per square foot. A significant portion of the missing sales do not have a sale price, indicating it was a transfer of ownership 'without a cash consideration', possibly from parent to child. Another excluded portion are residential lots with small structures on them.
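To make the 'sane price per square foot' filter concrete, here is a minimal sketch of one way such a check could look. The field names (salePrice, grossSquareFeet) and the min/max bounds are assumptions for illustration, not the values used to build the dataset, and the 'individual unit' test is not covered.

```javascript
// Return price per square foot, or null when the sale cannot produce a sane value.
// NOTE: field names and the default bounds are illustrative assumptions.
function pricePerSquareFoot(sale, { min = 10, max = 10000 } = {}) {
  const price = Number(sale.salePrice);
  const sqft = Number(sale.grossSquareFeet);

  // No sale price usually means a transfer without a cash consideration;
  // no square footage means there is nothing to divide by.
  if (!(price > 0) || !(sqft > 0)) return null;

  const ppsf = price / sqft;
  return ppsf >= min && ppsf <= max ? ppsf : null;
}

// Keep only the sales that yield a usable number.
function usableSales(sales) {
  return sales.filter((sale) => pricePerSquareFoot(sale) !== null);
}
```

Returning null instead of throwing keeps the filter a one-liner and makes it easy to count how many sales were dropped.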

If you would like to download the dataset, it can be found here. I have visualized this dataset a bit here. This post mostly serves to document how the dataset was created in case there are any issues.

Gather

First, download each individual year of the apartment sales data in Excel format from the NYC Department of Finance website. In this example, we will be combining all available years of data with the rolling sales data to create a dataset of sales from the beginning of 2003 to the end of 2014.

Combining them is not that difficult since they are formatted the same. I was able to combine them by simply dragging them all into Google Refine at the same time.

After getting the desired data into one Google Refine project, export the project into a CSV and then to JSON for easier working in Node land.

Clean

You should now have a file of sales in json format. Next, run coffee components/datautil/cleanup_nyc_sales.coffee. This will make the key names more javaScriptFriendly and trim whitespace from the end of lines.
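The cleanup script itself is not reproduced in this post. As a rough sketch of what a step like this might do, the snippet below converts spreadsheet-style headers to camelCase and trims surrounding whitespace from string values. The example column names are assumptions for illustration.

```javascript
// Convert a header such as "SALE PRICE" into a camelCase key ("salePrice").
function toCamelCase(key) {
  const words = key.trim().toLowerCase().split(/[^a-z0-9]+/).filter(Boolean);
  return words
    .map((word, i) => (i === 0 ? word : word[0].toUpperCase() + word.slice(1)))
    .join("");
}

// Return a copy of one sale with friendly keys and trimmed string values.
function cleanSale(sale) {
  const cleaned = {};
  for (const [key, value] of Object.entries(sale)) {
    cleaned[toCamelCase(key)] = typeof value === "string" ? value.trim() : value;
  }
  return cleaned;
}

// Hypothetical input row; the real export has many more columns.
const cleanedExample = cleanSale({ "SALE PRICE": "450000 ", "GROSS SQUARE FEET": " 900" });
console.log(cleanedExample);
```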

Geocode

I tried to geocode the sales data using the PLUTO dataset. Sadly, because the sales happened over the past 11 years, many don't exist in the current PLUTO dataset. This geocodes only 242,473 out of 318,713 sales. I then tried to fill in the remaining sales using the NYC API, simply making API requests for each of the sales missing geo data. This relies on the assumption that NYC has agreed on a projection of the earth, but alas they have not. This resulted in a useless dataset with two Brooklyns a couple hundred miles apart. Not ideal! So what do we do? We use the NYC API for the whole thing.

First we generate a list of BBLs missing lat-long with coffee components/datautil/bbl-missing-lat-long.coffee, and then run foreman run coffee components/datautil/nycapi-fetch.coffee. I tried to be nice and not parallelize requests to the API, so this script takes a full day to run.
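The fetch script is not shown here, so the following is only a sketch of the 'one request at a time' approach. A BBL is the ten-digit borough-block-lot identifier (1 digit, then 5, then 4). The lookup function is a hypothetical stand-in for whatever wraps the real API call, and the delay is an arbitrary example value.

```javascript
// Build a ten-digit BBL from its parts: 1-digit borough, 5-digit block, 4-digit lot.
function toBbl(borough, block, lot) {
  return String(borough) + String(block).padStart(5, "0") + String(lot).padStart(4, "0");
}

// Resolve after `ms` milliseconds.
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Geocode BBLs strictly one at a time with a pause between requests.
// `lookup` is a hypothetical async function (bbl) => ({ lat, lng }).
async function geocodeSequentially(bbls, lookup, delayMs = 250) {
  const results = {};
  for (const bbl of bbls) {
    try {
      results[bbl] = await lookup(bbl);
    } catch (err) {
      results[bbl] = null; // record the miss and keep going
    }
    await sleep(delayMs);
  }
  return results;
}
```

Awaiting inside the loop is what keeps the requests serial. It is slow by design, which is why a run over a few hundred thousand sales can take a day.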

Now that we have a set of geocoded sales, we can import them into QGIS. This will display the sales in a nice Brooklyn-esque shape but without much else. Next up is to add neighborhoods.

Merge

This time we will try to merge the neighborhood tabulation areas with our geocoded sales dataset in QGIS.

The main issue here is that our geocoded sales and the NTAs use a different projection. Through messing around with CRS transformations in QGIS I was unable to get the two datasets to overlap. For some reason, if you import the neighborhood data into CartoDB and then re-export it, the resulting data is mergeable with the sales data. I was able to get them to overlap perfectly by setting both sales dataset and the neighborhoods dataset to the WGS 84 projection.

The next step is to merge them. QGIS has a nice function for this under Vector -> Data management tools -> Join Attributes by location. It may sound like your laptop is trying to lift off and become a drone, but be patient. After a little while, you will end up with a dataset merging the two. Sadly, this moves every column from the neighborhood dataset on to each of the individual sales making for an enormous dataset.
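For readers without QGIS, the idea behind a join by location can be sketched in a few lines: test each sale's point against each neighborhood polygon and copy over only the fields you need. This is not what QGIS does internally. It is a plain ray-casting check that assumes both datasets already share one coordinate system and that each neighborhood is a single ring with no holes. The ntaCode and ntaName fields are hypothetical names.

```javascript
// Ray casting: count how many polygon edges a horizontal ray from the point crosses.
// `ring` is an array of [x, y] vertices; the last vertex connects back to the first.
function pointInPolygon([x, y], ring) {
  let inside = false;
  for (let i = 0, j = ring.length - 1; i < ring.length; j = i++) {
    const [xi, yi] = ring[i];
    const [xj, yj] = ring[j];
    const crosses =
      (yi > y) !== (yj > y) &&
      x < ((xj - xi) * (y - yi)) / (yj - yi) + xi;
    if (crosses) inside = !inside;
  }
  return inside;
}

// Attach only the neighborhood id and name to a sale, or nulls if nothing matches.
function assignNeighborhood(sale, neighborhoods) {
  const match = neighborhoods.find((n) => pointInPolygon([sale.lng, sale.lat], n.ring));
  return {
    ...sale,
    ntaCode: match ? match.ntaCode : null,
    ntaName: match ? match.ntaName : null,
  };
}
```

Copying just the code and name avoids the bloat described above, where every neighborhood column ends up on every sale.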

Export

Before exporting this shapefile, use the Table Manager plugin for QGIS to remove all of the neighborhood columns except the ID and name if needed. All we want to do is to be able to associate the individual sale with the neighborhood. Now export and you will have geocoded NYC sales data grouped by NTA.

13th February 2017

Today we're happy to announce we've open sourced the entire Artsy.net web app, Force.

Over the past few months, we've rewritten our web front-end to move off Rails and on to a Node.js stack that shares Javascript code and rendering between the server and client, otherwise known as Isomorphic Javascript. After migrating to this new stack, we open-sourced our boilerplate, Ezel, and have now gone a step further and open sourced Artsy.net.

Isomorphic vs Monolithic

Our transition to an isomorphic Javascript stack has been very successful albeit with some speed bumps. If you're interested in the details we've written a blog post, given a talk at Node on the Road (slides here), and another more extensive talk at this meetup.

The short story is that we moved from a monolithic Rails app to a couple of Node servers on Heroku. This vastly improved the performance of our site and our own development speed. Using the patterns in Ezel, we are able to tailor asset packages to specific pages and render some of the page on the server. This cut our page-load time in half (from 6.5 seconds to under 3 seconds) and our tests take about 5 minutes (down from around 5 hours!) with little reduction in coverage. Performance numbers aside, our real win was dramatically improved development speed due to some architecture decisions we made.

Modularity

One of the biggest takeaways from the transition is the pleasure of modularity. By breaking our project up into smaller reusable pieces such as apps & components we make it easier to experiment, test, and refactor with confidence knowing our code is encapsulated into clearly defined pieces.

For instance, we recently redesigned our about page. To gradually introduce the new page, we simply started a new about2 app alongside our old about app, which you can see a little way back in Force's history. This let us push code into the new about2 app with confidence it wasn't touching other parts of the stack. When it was time to ship it, we simply deleted the old about app folder and did a search and replace of "about2" with "about". There was no need to dig around various stylesheets, views, etc. folders looking for places where code for the old about page might still live.

Components are particularly useful for re-usability. For instance, building this gene page (source code here) was mostly a matter of pulling in various components like a follow button, a filter component, this artist fill-width layout, etc. Because the CSS for those components is clearly self-contained, it's easy to build up a small asset package that uses only the minimal CSS needed, which you can see here.

We're so convinced this encapsulation is important that we've updated Ezel to use app/component-level public folders by default so you can even modularize static assets, like images, and keep them coupled with their respective apps/components.

Open Source by Default

Even though Force isn't a library, we have open-sourced many of its components and libraries. Before open sourcing Force, we open sourced app-specific modules such as these backbone mixins, this Artsy API authentication library, or this module we use to cache server-side Backbone requests.

Open-sourcing Force was pretty straightforward but we needed to keep our sensitive keys/secrets private while not complicating development. To do this we wrote a .env file and uploaded it as a private gist that gets downloaded when setting up the app. We wanted to spread this open-source-by-default culture so we decided to update Ezel's configuration to be able to use a .env file in this way as well. This makes it easy to keep your sensitive configuration data private while allowing the rest of your app code to be open source. You can read more about this in Ezel's Build Scripts & Configuration docs.
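For anyone unfamiliar with the pattern, a .env file is just KEY=value lines that get loaded into the process environment at startup. The sketch below is not Ezel's or Force's actual loader. It is a minimal illustration of the idea, and the function names are made up for this example.

```javascript
// Parse the text of a .env file into a plain object.
function parseEnv(text) {
  const config = {};
  for (const rawLine of text.split(/\r?\n/)) {
    const line = rawLine.trim();
    if (line === "" || line.startsWith("#")) continue; // skip blanks and comments
    const eq = line.indexOf("=");
    if (eq === -1) continue; // not a KEY=value line
    const key = line.slice(0, eq).trim();
    // Strip one pair of matching surrounding quotes, if present.
    const value = line.slice(eq + 1).trim().replace(/^(['"])(.*)\1$/, "$2");
    config[key] = value;
  }
  return config;
}

// Copy parsed values into an env object without overriding what is already set,
// so real environment variables (e.g. on Heroku) win over the file.
function applyEnv(config, env) {
  for (const [key, value] of Object.entries(config)) {
    if (!(key in env)) env[key] = value;
  }
  return env;
}
```

In an app this would typically be called once at boot, along the lines of applyEnv(parseEnv(fileContents), process.env), with the .env file itself kept out of the public repository.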

Spreading The Love

Force serves as an example of how we structured a large Ezel project and contains the full commit history of its construction. Unfortunately, due to image licensing issues, we cannot open up the Artsy API and therefore this repository can't serve as a runnable clone of our website. However, we will continue to merge our production code into it. If you have any questions feel free to hit us up on twitter: @craigspaeth, @dzucconi, @zamiang.

We're excited to continue pushing open source at Artsy. For more exciting open source projects take a look at our Github profile.

Written with Craig Spaeth

13th February 2017

People talk a lot about "culture fit" and hiring "people you want to hang out with". Identifying engineers who are motivated, knowledgeable and good at working on a team is already difficult enough. Why further complicate the hiring process with these abstract metrics?

Let's hire people to do what they love doing. Maybe instead of thinking about jobs in terms of technical expertise, like 'front end', 'back end', 'systems', we can talk about what gets people excited.

In today's (relatively) high level programming world, engineers should have a level of comfort across the stack. Once patterns and code standards are somewhat well established, a 'css person' should feel comfortable editing the database models or working on an API and submitting a pull request (if not, your team or codebase has much bigger problems).

We have high level programming languages, why not give people high-level job titles?


Here are a couple of examples of 'high-level' job titles that come to mind:

  • Tooler: building tools to make people more effective at what they do. This spans the whole company. You just really love using technology to make people's jobs more efficient.

  • Growth-hacker: while this title is certainly 'in vogue', it is a perfect example of a high-level job title. Your job spans product and engineering, you do whatever it takes to encourage 'growth', however that is defined for your project.

  • User focused: people who love building stuff that people use. I'd probably fall into this category. Basically, you are someone who will do whatever it takes to get a great product in front of people.

  • The academic: some people love correctness and broad sweeping design patterns. These are the people who love designing frameworks and APIs for other engineers to work with.

  • The tuner: you love making stuff fast. Performance is a very important feature with 'on the go', mobile lifestyles and the global marketplace. Whatever the task, you want to make sure it doesn't take very long.

While these names are pretty off the cuff, let's at least acknowledge that great things happen when you get smart, passionate people together with diverse perspectives. You can put together that team more effectively if you understand not only the different technical specialties of programming, but the different types of programmers.

Job titles at Artsy

I am one of about 15 engineers at Artsy.net. Everyone is just an engineer here and some people 'own' various technical components, but even that gets murky sometimes as people go through their performance cycles. People generally work on what they want to work on. By having such an open-ended working process, we've seen these types of roles emerge. As the team grows, if we need to formalize roles, it would feel unnatural to give people titles like 'lead front-end engineer'. I don't know what that means for us. We have a gallery CMS, a mobile site, a desktop site and internal tools, each with their own 'front-end'. As a company, we think in terms of KPIs we want to keep in line and features we want to build for our users. Roles and responsibilities work best if company priorities align with who is going to 'own' that project. This limits the need for heavy-handed project management and allows our engineers to simply do what they love while happening to build a great product.

Cheers to 'high level' programming titles. This is really just an idea at this point, and I'm sure it has its own set of issues. Please let me know if you have tried something similar.

13th February 2017

We all have stories, as engineers, of fixing some crazy thing at the last minute right before the demo goes up. We have all encountered situations where we needed to fix something that was our fault and we needed to fix it now.

This story is something that I think about in those times to remember to stay calm. No last minute fix could ever be this dramatic or important.

My grandfather passed away about a week ago. At the service, I was asked to say a few words and read from his memoirs. This was my choice.


RED TEAM 4 TO THE POD

The first unmanned launch of a Saturn V on November 9th, 1967. From the personal memoirs and the pen of William E. Moore, January 28th, 1994.

There was five of us Rocket Scientists lounging around the ready room listening to the Apollo 4 Countdown on loud speakers and headsets. We were members of the Red Team Group and we were the Electrical Systems experts on all hardware interfaces between the firing room and the Saturn V vehicle three miles away. Our ears were now being drawn into a developing situation happening on the net. No response was received from an electrical circuit that controlled the separation of the S-II Stage from the S-1C Stage in flight.

"That was one of my electrical circuits!"

It just so happened that circuit is controlled by a series of relays located almost directly beneath that cold beast that was spewing out all kinds of funny colored, very cold gases -- the Saturn V rocket. We took a look at our blue prints and found the relay that must be the problem and called for a recycle in the countdown to a point where we could cycle the switch on the electrical networks console to see if the relay would pick up -- that was a "no go". Now things got serious. The NASA Test Conductor was talking 'scrub the launch' but our S-II Stage Test Conductor was talking 'go to the pad'.


Well, the Red Phone rings.

"Bill, how sure are you that this relay is the problem? Are we going to send people to the pad to rewire the rocket and not be able to launch because we guessed wrong?" said "AC" Filbert C. Martin

"It's worth a shot, the signal is not reaching the vehicle and that relay module is the only active component between the Firing Room Console and the Vehicle. You snap out the old Relay Module and snap in the new one and we will be able to tell if that was the problem a few seconds later."

"Well, we are a little concerned about sending a team to the pad with a fully loaded vehicle. We thought your team would do a lot of blueprint trouble shooting -- I'm not sure we planned to actually send anybody out to a fueled vehicle"

"Just don't let them launch this mother till we are at least half way back from the pad -- OK!"

About thirty minutes later the five of us (Bob Kelso NRR Sr. Tech, Bill Moore; NAR Engineer/ Team Leader, the NASA Safety Engineer, the NRR Quality Control and the NASA Pad Leader) got the official word to head for the Launch Pad with our new Relay Pod. It was 11:30pm. It was a dark, slow, three mile trip. As we got closer to the Saturn V it was shrouded in a white cloud of venting gases which relieved the pressures building up inside the vehicle fuel tanks.

Our goal was to enter this two level hermetically sealed, all welded steel coffin called the Mobile Launcher Base topped by a fully loaded 363 ft. high Saturn V, weighing 6.2 million pounds, and the permanently attached 380 ft. high Umbilical Tower, weighing 500k pounds. We finally stopped and left our van to walk up and into the second level of the Mobile Launcher Base. About this time, it came to my mind that during one of our training sessions we were told that one of the fully fueled prototype S-II rocket stages had been exploded out in the desert. The results showed that all buildings better be at least three miles from the launch pads - which they are. We were now within 25 feet of this 363ft tall bomb that sounded like its giant fuse had been lit, and we were soon going to get much closer.

The Saturn V was more noisy and ghostly than I had ever expected and it had grown much taller and certainly more threatening since last week. The venting fuel made loud hissing sounds when relief valves popped or opened up suddenly. It was very easy to let your imagination infect your brain. This is a very dangerous place and everything seems to be moving in the heavy foggy mist. There was no way to talk to each other, heck, we could barely see each other and...we hadn't thought of this problem so we held onto each other's yellow protective clothing like kindergartners crossing the street. We all wore safety helmets but they just did not make you feel like you were really safe.

We climbed up the last step prior to opening the sealed submarine type entry door that led into the second level. As we slowly opened the heavy steel hatch-type pressurized door, it was like stepping into the jaws of a huge steaming dragon. The nitrogen fog, used to suppress fire, and the dim red glow from the emergency lights of level A made it look like a Hollywood swamp scene. We started making our way through the 21 compartments to find our Relay Rack as the noise took on a more penetrating tone that seemed to bounce from wall to wall.

The smell became a mixture of kerosene with a mild touch of burnt paint and rubber. I was glad that the astronauts did not take this path to go aboard the Saturn V because my goosebumps were changing to a weird color of purple. With the realization that this was a much worse place to be trapped in, the team moved more rapidly to the relay rack. We replaced the old relay module and then had to cycle the switch on the firing room console. We then checked that the relay kicked in and that the signal was picked up on the vehicle. We resealed the cabinet, signed off on all the paperwork and got the out of there without any more sight seeing.

The drive back to the ready room was very fast and uneventful. The five of us were like stone figures, thinking about where we had been and what we had just accomplished. What could have happened and didn't. All of this without ever realizing that this experience was as close to being in the shoes of a Saturn V astronaut as any of us would ever be again.


In later letters, my grandfather mentions how fortunate he really was, having grown up a farm boy in West Virginia, to have not just once in a lifetime experiences, but really once in many lifetimes experiences. The service was about celebrating his life, and this seems like one of those incredibly unique events that really does celebrate his life, both in terms of how he handled a mind-bogglingly stressful situation and how he tells it with the comfortably detailed and slyly humorous ease that was so characteristic of how he spoke.

A really incredible man who really contributed a lot to the world around him and meant a lot to those close to him, he will be sorely missed.

13th February 2017

This post details the first of many challenges we faced in 3D transforming the homepage of Artsy (inspired by Meny): detecting CSS 3D transform support.

Front-end development is messy in today's fragmented world. At Artsy, our goal is to do what it takes to provide an incredible experience for all of our users (IE8+, iOS and the usual suspects). Deploying bleeding edge tech, like CSS 3D transforms, is an exercise in compromising principles for practicality -- and managing these "compromises" in well-documented code.

We looked to Modernizr's feature detection approach to provide us with a reliable way to detect CSS3 3D transform support across browsers. They have some well-documented struggles around the issue. After flipping most of the tables in the office ┻━┻ ︵ヽ (`Д´)ノ︵ ┻━┻ , we settled on user agent sniffing as the most robust method for detecting CSS3 3D transform support. But why did none of the available methods work for us?

CSS3 3D transforms involve interaction between the browser and the graphics card. The browser may be able to parse the 3D declarations but may not be able to properly instruct the graphics card in how to render your page. There are many possible outcomes ranging from the page rendering with lines across it (Safari 4) to the page rendering beautifully then crashing the browser seconds later (Safari on iOS4). Any 'feature detection' approach would unacceptably flag these as 'supports CSS3 3D transforms'. This is one case where 'feature detection' fails and user agent sniffing (and lots of testing) wins hands down.

Most feature detection assumes a "supports" or "does not support" binary. This is not the case with CSS3 3D transforms -- there is a "gradient of support". Additionally, enabling 3D transforms causes the page to be re-rendered in an entirely different rendering engine which then causes other problems (more on this in a later post).

CSS3 3D transform support can be separated into 4 levels:

  1. Reliably supports 3D transforms across most machines. For example: Safari 6.
  2. Can parse and apply 3D transform declarations but ignores the 3D parts. For example: Chrome on a Retina MacBook Pro.
  3. Can parse and apply 3D transform declarations but renders in unacceptable ways. For example: Safari 4 and Safari 4/5 on Windows show lines across the page.
  4. Cannot apply 3D transform declarations in any way. For example: IE or Firefox < v10.
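To make the "gradient of support" concrete, here is a minimal sketch of why a yes/no feature test cannot separate these levels. The names (`SupportLevel`, `featureTestSaysSupported`, `safeToEnable3D`) are hypothetical, and the sketch assumes an idealized feature test that answers "yes" whenever the browser accepts the 3D declarations.

```javascript
// Hypothetical names; the numbering follows the four levels listed above.
const SupportLevel = Object.freeze({
  RELIABLE: 1,      // renders 3D correctly on most machines
  IGNORES_3D: 2,    // accepts the declarations but ignores the 3D parts
  RENDERS_BADLY: 3, // accepts the declarations but renders unacceptably
  UNSUPPORTED: 4    // cannot apply the declarations at all
});

// Assumption: an idealized feature test only sees whether declarations are accepted.
function featureTestSaysSupported(level) {
  return level !== SupportLevel.UNSUPPORTED;
}

// What we actually want to turn 3D on for.
function safeToEnable3D(level) {
  return level === SupportLevel.RELIABLE;
}

// Levels where the two answers disagree, i.e. false positives.
const falsePositives = Object.values(SupportLevel).filter(
  (level) => featureTestSaysSupported(level) && !safeToEnable3D(level)
);

console.log(falsePositives); // [ 2, 3 ]
```

Levels 2 and 3 are exactly the cases that the detection methods below get wrong for us.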

Here are a few popular ways of detecting CSS3 3D transform support and why they don't work for us:

Meny / Hakim's method

{% highlight coffeescript %}
# apply these styles to the body in css then see if they are applied in JS
# (CoffeeScript's `of` compiles to the JavaScript `in` property check)
docStyle = document.body.style
supports3DTransforms = 'WebkitPerspective' of docStyle or
  'MozPerspective' of docStyle or
  'msPerspective' of docStyle or
  'OPerspective' of docStyle or
  'perspective' of docStyle
{% endhighlight %}

This works best and is straightforward code. The only issue is that it throws a false positive for iOS4, causing the browser to crash, and a false positive for Safari on Windows and Safari 4 on OS X, which both display a grid over the page when using the 3D renderer.

iScroll4 method

{% highlight coffeescript %}
has3D = -> 'WebKitCSSMatrix' of window and 'm11' of new WebKitCSSMatrix()
{% endhighlight %}

This only works reliably in Safari in our testing.

Modernizr method

{% highlight coffeescript %}
ret = !!testPropsAll('perspective')
if ret and 'webkitPerspective' of docElement.style
  # create a div and see if it moves
  injectElementWithStyles '@media (transform-3D), (-webkit-transform-3D){#modernizr{left:9px;position:absolute;height:3px;}}', (node, rule) ->
    ret = node.offsetLeft is 9 and node.offsetHeight is 3
{% endhighlight %}

This creates a div, transforms it, and then checks if its position has changed as expected. It only works reliably in Safari. It sometimes works in Chrome but throws a false positive in the case of Chrome on Retina MacBook Pro as the element does move -- just not in 3D space.

User Agent method

We want to maintain wide support of new tech while ensuring all users have a great experience. Modernizr and the feature detection group have their heart in the right place and do a great job most of the time. That said, user agent sniffing is the only way to handle the complex support scenarios inherent in bleeding edge CSS3 tech such as 3D transforms.

Here is our method/hack for identifying browsers that support CSS3 3D transforms well:

{% highlight coffeescript %}
(->
  docElement = document.documentElement
  uagent = navigator.userAgent.toLowerCase()

  # [browser, minimum major version]; Infinity means never enable
  browsers = [
    ['webkit', 530]      # not well supported in Safari 4, Safari 5 webkit version is 530.17
    ['chrome', 12]
    ['mozilla', 10]
    ['opera', Infinity]  # not supported
    ['msie', Infinity]   # not supported
  ]

  # From: http://api.jquery.com/jQuery.browser
  uaMatch = (ua) ->
    match = /(chrome)[ \/]([\w.]+)/.exec(ua) or
      /(webkit)[ \/]([\w.]+)/.exec(ua) or
      /(opera)(?:.*version|)[ \/]([\w.]+)/.exec(ua) or
      /(msie) ([\w.]+)/.exec(ua) or
      ua.indexOf("compatible") < 0 and /(mozilla)(?:.*? rv:([\w.]+)|)/.exec(ua) or
      []
    { browser: (match[1] or ""), version: (match[2]?.split('.')[0] or 0) }

  addNo3DTransform = ->
    docElement.className = docElement.className.replace 'csstransforms3D', ''
    docElement.className += ' no-csstransforms3D'

  add3DTransform = ->
    docElement.className = docElement.className.replace 'no-csstransforms3D', ''
    docElement.className += ' csstransforms3D'

  # default to no CSS3 3D transform support
  addNo3DTransform()

  match = uaMatch uagent
  for browser in browsers
    if browser[0] == match.browser
      if match.version >= browser[1]
        add3DTransform()
      else
        addNo3DTransform()
      break

  IS_IPHONE = uagent.search('iphone') > -1 or uagent.search('ipod') > -1
  IS_IPAD = uagent.search('ipad') > -1
  IS_IOS = IS_IPHONE or IS_IPAD

  # iOS 6 is our support cut off for iPad
  match = /os ([0-9]+)/.exec uagent
  IS_LT_IOS6 = match and match[1] and Number(match[1]) < 6

  if IS_IPHONE
    # 3D transforms are supported but do not work well on iPhone
    addNo3DTransform()
  else if IS_IPAD and IS_LT_IOS6
    # disable 3D transform for older versions of Safari on iPad
    addNo3DTransform()
  else if navigator.userAgent.search('Safari') > -1 and navigator.userAgent.search('Windows') > -1
    # deactivate 3D transform for Safari on Windows
    addNo3DTransform()
)()
{% endhighlight %}

If you would like to take issue with or improve this code, please check it out on GitHub.
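Because the decision depends only on the user agent string, it can also be written as a pure function and exercised against sample strings without a browser. The following is a rough JavaScript sketch of that idea, not the code we ship. The names `MIN_VERSIONS` and `supports3DTransforms` are hypothetical, the user agent strings are illustrative samples, and the Windows check is narrowed to user agents classified as `webkit`, which is an assumption about intent (Chrome on Windows also has "Safari" in its user agent string).

```javascript
// Minimum major version per engine; Infinity means "never enable".
const MIN_VERSIONS = { webkit: 530, chrome: 12, mozilla: 10, opera: Infinity, msie: Infinity };

// Same regexes as the old jQuery.browser uaMatch; expects a lowercased UA string.
function uaMatch(ua) {
  const match =
    /(chrome)[ \/]([\w.]+)/.exec(ua) ||
    /(webkit)[ \/]([\w.]+)/.exec(ua) ||
    /(opera)(?:.*version|)[ \/]([\w.]+)/.exec(ua) ||
    /(msie) ([\w.]+)/.exec(ua) ||
    (ua.indexOf('compatible') < 0 && /(mozilla)(?:.*? rv:([\w.]+)|)/.exec(ua)) ||
    [];
  return { browser: match[1] || '', version: parseInt(match[2] || '0', 10) };
}

// Returns true only for browsers we are willing to serve 3D transforms to.
function supports3DTransforms(userAgent) {
  const ua = userAgent.toLowerCase();
  const { browser, version } = uaMatch(ua);
  const min = MIN_VERSIONS[browser];
  const meetsMinimum = min !== undefined && version >= min;

  const isIPhone = ua.includes('iphone') || ua.includes('ipod');
  const isIPad = ua.includes('ipad');
  const iosMatch = /os ([0-9]+)/.exec(ua);
  const isLtIOS6 = iosMatch !== null && Number(iosMatch[1]) < 6;
  // Assumption: only Safari-like (non-Chrome) WebKit on Windows is excluded.
  const isWindowsSafari = browser === 'webkit' && ua.includes('windows');

  return meetsMinimum && !isIPhone && !(isIPad && isLtIOS6) && !isWindowsSafari;
}

// Illustrative sample user agent strings.
const SAFARI_6_MAC = 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_8_2) AppleWebKit/536.26.17 (KHTML, like Gecko) Version/6.0.2 Safari/536.26.17';
const IPHONE_IOS_6 = 'Mozilla/5.0 (iPhone; CPU iPhone OS 6_0 like Mac OS X) AppleWebKit/536.26 (KHTML, like Gecko) Version/6.0 Mobile/10A5376e Safari/8536.25';
const IPAD_IOS_5 = 'Mozilla/5.0 (iPad; CPU OS 5_1 like Mac OS X) AppleWebKit/534.46 (KHTML, like Gecko) Version/5.1 Mobile/9B176 Safari/7534.48.3';
const IE_9 = 'Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.1; Trident/5.0)';
const FIREFOX_9 = 'Mozilla/5.0 (Windows NT 6.1; rv:9.0) Gecko/20100101 Firefox/9.0';

console.log(supports3DTransforms(SAFARI_6_MAC)); // true
console.log(supports3DTransforms(IPHONE_IOS_6)); // false
```

Keeping the rules in one function like this makes it cheap to add a regression case every time testing turns up another browser that misbehaves.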


Written By

Brennan Moore

I'm a product engineer based in NYC. I'm passionate about building innovative digital products people love.

Follow me on Twitter here.