Interviewing With GitLab (GTLB)

Elijah Snyder
HackerNoon.com

--

Lesson Learned: You Have to be Born Into It

Update! Gitlab graciously reached out to me about this interview and provided feedback from the team via telephone. You can read about that feedback at the end of the article.

Update 2! In spite of the Handbook's promises of reasonable interviews, feedback about the interview, and fair interviews, it appears the true reason for all of the struggles below was so that Daniel Peric could backfill his own position with Nicki Peric and move on to a Technical Account Manager role. You read that correctly: the team's choice "to make them successful" was, oddly enough, a member of their own family. GitLab leans on Hanlon's Razor ("Do not attribute to malice what can be explained by mistakes") to insist that a convenient series of mistakes led to a scenario where someone's family member was hired over more qualified candidates.

GitLab is one of my favorite projects. I have nothing but praise for the product and the team. Whenever possible, I try to show fellow technologists, and even the companies I do work for, just how easy it is to get GitLab set up, accepting code, and firing off CI/CD tasks. Watching teams convert, trading .gitlab-ci.yml tips with each other while the pipelines all show up green across distributed runners, all set up in an afternoon, is addictive.

More than a few times I have had the responsibility of installing and handling all of the care and feeding of GitLab. Without fail, their Chef-based installation and configuration management always works for me. Upgrades, which appear on the 22nd of every month, are a breeze and have never left me in some unrecoverable state with an obscure error. The same can't be said about many other software products I have had to attend to during my professional career.

When I saw the following tweet I jumped at the opportunity.

Of course I would want to be an Implementation Engineer!

Round 1, Early Assessment

The first part of applying involved being emailed a small questionnaire. Most of the questions were essay-style questions about how you would help a customer, in what scenarios you might recommend Virtual Machines or Kubernetes, and a somewhat murky question about HA. The final part of the questionnaire stated: “Demonstrate your ability to install omnibus GL on AWS or cloud platform of choice”. You are meant to submit all of this through their onboarding platform, into a rather small text box given the lengthy answers I had typed, but what else is a GitLab enthusiast to do except…

I prepared my answers, installed Omnibus, and then checked my answers into the fresh GitLab install. I submitted a link to that project in the new GitLab installation. :)

At this step, since the language specified Omnibus, I assumed it meant that I should manually install GitLab. I’ve certainly done this countless times: within a few minutes I had put together my installation. I really wanted to use Google Managed Certificates and I thought that would also cover an HA question in the questionnaire. A basic GCE VM was configured to host the installation.

An example of my configuration would be something like this:

# /etc/gitlab/gitlab.rb (excerpt)
# TLS terminates at the Google Load Balancer, so the bundled registry nginx
# only needs to listen on plain HTTP.
registry_nginx['listen_port'] = 80
registry_nginx['listen_https'] = false

# Point GitLab at an external PostgreSQL host instead of the bundled database.
gitlab_rails['db_adapter'] = "postgresql"
gitlab_rails['db_encoding'] = "unicode"
gitlab_rails['db_database'] = "gitlab"
gitlab_rails['db_username'] = "gitlab"
gitlab_rails['db_password'] = "supersecuregitlabpassword"
gitlab_rails['db_host'] = "10.26.241.3"

TLS was terminated at the Google Load Balancers with Google Managed Certificates. All of this was covered in the README.md of the only project this installation housed — the same project linked in the response.

I would normally manage something like this with Chef, Puppet, or Ansible, but went with manual to make sure I wasn’t submitting something unnecessary.

Side note: just in case the reader hasn’t done this before… the Omnibus installer is Chef! It’s very simple to update configuration and I can’t think of a single time that the Omnibus Chef built-in management has ever failed on me.

Round 2, Interviews

I completed an interview with the HR representative. The conversation was enjoyable and engaging. We coordinated another interview with the hiring manager.

The interview with the hiring manager was great, as well. The conversation was a lot more technical than the previous interview, but two technologists quickly fall into a rhythm of talking about the thing we love: technology!

Round 3, Tech Demo (Part A: Preparation)

Unfortunately, this is where everything fell apart. I had previously been instructed to install GitLab on a cloud provider (again), except this time it was explicitly stated that the installation should follow IaC (Infrastructure-as-Code) disciplines. No problem!

I had a week to prepare. During that week, I decided to see how the work on installing GitLab on Kubernetes was going. I had been keeping an eye on the GitLab Cloud Native Helm Chart and could not wait to use it. To be honest, my first few attempts were a smattering of nginx ingress controller configuration, but I quickly learned it was unnecessary: as I should've expected, the GitLab Helm Chart works just as well as the GitLab Omnibus Installer and does nearly everything for you with very convenient configuration. The documentation extensively covers use-cases like using an external database. The Helm chart is really great, but it does note that Geo and Pages are not intended to be used with it (yet) because of a few technology challenges the team is working through.
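Since everything ended up driven by terraform (more on that below), the chart itself goes in through terraform's helm provider. As a rough sketch of the external-database wiring, with resource names, the secret name, and chart values recalled from the chart's documentation rather than copied from my actual project, it looks something like this:

# Install the GitLab chart via terraform's helm provider and point it at an
# external PostgreSQL instance (the Cloud SQL resource sketched just below)
# instead of the bundled in-cluster database. Names here are placeholders.
resource "helm_release" "gitlab" {
  name       = "gitlab"
  repository = "https://charts.gitlab.io/"
  chart      = "gitlab"
  namespace  = "gitlab"

  set {
    name  = "global.hosts.domain"
    value = "${var.domain}"                 # placeholder domain variable
  }

  # Skip the chart's in-cluster PostgreSQL...
  set {
    name  = "postgresql.install"
    value = "false"
  }

  # ...and use the Cloud SQL instance's private IP instead.
  set {
    name  = "global.psql.host"
    value = "${google_sql_database_instance.gitlab.private_ip_address}"
  }

  # Kubernetes secret (created separately) holding the database password.
  set {
    name  = "global.psql.password.secret"
    value = "gitlab-postgres-password"
  }

  set {
    name  = "global.psql.password.key"
    value = "password"
  }
}

Depending on the helm provider version in use, you may need to register the chart repository separately before the URL form of repository works; treat the above as a sketch, not a drop-in.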

At the end of last year, Google announced the Private IP feature for Cloud SQL databases. Paired with the Service Networking API, it is now extremely easy to automatically add Cloud SQL databases to your projects. Previously, there was generally some work with Cloud SQL Proxy, which was often cumbersome and, since you were routing out of your project and back in, it was reasonable to assume higher latency, traffic cost, and additional security considerations. I used these features in my installation to provide a Highly Available PostgreSQL Cloud SQL instance (that's a mouthful) to the GitLab installation.
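The shape of that in terraform is roughly the sketch below. The resource and variable names are placeholders of mine rather than the project's actual code, but the three moving pieces are the same: a reserved internal range, the Service Networking peering, and a Cloud SQL instance that only has a private IP.

# Reserve an internal range in the VPC and peer it with Google's service
# producer network so Cloud SQL can live on a private IP.
resource "google_compute_global_address" "sql_range" {
  name          = "gitlab-sql-range"
  purpose       = "VPC_PEERING"
  address_type  = "INTERNAL"
  prefix_length = 16
  network       = "${google_compute_network.gitlab.self_link}"   # placeholder VPC
}

resource "google_service_networking_connection" "sql_peering" {
  network                 = "${google_compute_network.gitlab.self_link}"
  service                 = "servicenetworking.googleapis.com"
  reserved_peering_ranges = ["${google_compute_global_address.sql_range.name}"]
}

# A regional (Highly Available) PostgreSQL instance with no public IP at all.
resource "google_sql_database_instance" "gitlab" {
  name             = "gitlab-postgres"
  database_version = "POSTGRES_9_6"
  region           = "us-central1"
  depends_on       = ["google_service_networking_connection.sql_peering"]

  settings {
    tier              = "db-custom-2-7680"
    availability_type = "REGIONAL"

    ip_configuration {
      ipv4_enabled    = false
      private_network = "${google_compute_network.gitlab.self_link}"
    }
  }
}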

All of this was driven by HashiCorp’s Terraform. A link to the project itself with an extensive README.md was sent to the interview team. Actually, two links were sent — the project mirrored on gitlab.com and also the project checked into an installation created by all of the work above.

I used some great new features, had an opportunity to work with the GitLab Helm Chart, and had a bulletproof deployment. I was all set!

You can find the project here if you’re interested: https://gitlab.com/e_snyder/gitlab-setup

Round 3, Tech Demo (Part B: Things Go Sideways)

A reader might assume that the heading of this section suggests my terraform didn’t work. They would be completely wrong.

The first mishap involved the actual hiring manager suddenly becoming unavailable to participate in the tech demo. The tech demo was re-coordinated with an interview team sans the hiring manager. This should have been red flag #1 for me.

On the day of the tech demo, I first asked if anyone had a chance to look at the project. There was quite an extensive README.md included. Inside of that README.md I outlined the decisions I had made (Private IP, Service Networking, Helm Chart, etc). I also included an example Run Book that included all required variables (locally, I had these in a `terraform.tfvars` file — a pretty common pattern).

Unfortunately, I never got any indication that the team had read the README.md before the tech demo. That README.md also included a small section mentioning that the terraform creates a brand new project, which sometimes complicates first runs since no APIs are enabled yet. This is important later.

I waited with my screen shared and poked around the existing installation’s UI to prove it worked. To get rolling, I was asked to destroy the project. Not a problem! I stated that it would most likely not destroy on the first try because of the helm provider, hefty GitLab install, and complete project destruction.

The team was surprised. They asked “Why wouldn’t it work?” and I did my best to answer that destroying a Kubernetes namespace, disabling the API, and other lengthy tasks often take time to complete and terraform would most likely error. I also explained that an asymmetrical run book that destroys just the project would instantly destroy the deployment. Alternatively, we could create a new project (all project resources had the “pet names” terraform resource for randomized naming — this meant we could create a new project at any time with no conflict or overlap). I don’t think they understood any of these options.
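The "pet names" bit refers to the random provider's random_pet resource. A minimal illustration of the pattern, with placeholder names and variables of my own rather than the project's code:

# Every run gets a fresh, human-readable suffix, so a brand new project can be
# created alongside an existing one with no naming conflicts.
resource "random_pet" "project" {
  length = 2
}

resource "google_project" "gitlab" {
  name            = "gitlab-demo-${random_pet.project.id}"
  project_id      = "gitlab-demo-${random_pet.project.id}"
  org_id          = "${var.org_id}"
  billing_account = "${var.billing_account}"
}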

“Well, let’s run the destroy, anyway,” they said and, as expected, we stumbled into a few errors including the namespace pending deletion. I used terraform state rm 3 or 4 times to clear out the resources-in-limbo. The destroy otherwise worked fine and I took the team on a quick tour through the GCP console to prove it and explain the terraform commands.

Red Flag #2, Google APIs

At this time I had a suspicion. I asked if “most other people use Ansible for this” and the team seemed to agree. It is worth noting that an installation with terraform that hands off to Ansible would only need to power down a GCE VM. There would be no API (de)provisioning or waiting for containers to stop.

For me, this was red flag #2: I had chosen a deployment method that, apparently, the team was unfamiliar with. In hindsight, had I known the team was only familiar with single-VM installations, I would have created a project with all of the necessary APIs already enabled and simply added/removed GitLab from an existing Kubernetes cluster rather than demonstrating a "from zero to GitLab" approach of starting from nothing.

With the project destroyed we continued forward. I warned again that we may run into a scenario where we have an API that’s slow to provision. We didn’t have this happen during the installation. The installation worked as expected without a hitch.

For those not familiar with terraform and GCP, here’s what it looks like when the API doesn’t quite make it online before the project attempts to use it:

* google_dns_managed_zone.gitlab_zone: 1 error(s) occurred:

* google_dns_managed_zone.gitlab_zone: Error creating ManagedZone: googleapi: Error 403: Google Cloud DNS API has not been used in project PROJECTNUMBER before or it is disabled. Enable it by visiting https://console.developers.google.com/apis/api/dns.googleapis.com/overview?project=PROJECTNUMBER then retry. If you enabled this API recently, wait a few minutes for the action to propagate to our systems and retry., accessNotConfigured

That little stanza is most likely extremely familiar to people using terraform with GCP. The solution is to follow exactly what the error message tells you to do — wait a few minutes for the action to propagate and retry.

These API operations are prone to hiccups with terraform, and the terraform team has done great work to wait on Google to report that an API is available. Google, however, does not wait until the API is 100% ready across every region, zone, and system before reporting success. Refer again to the error Google will give you even after the API reports it is ready.
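In terraform terms the dependency usually looks something like the sketch below (the resource names and domain are illustrative, not lifted from my project). Even with an explicit depends_on, the very first apply in a brand new project can still trip over that propagation delay, and the fix really is just to wait a minute and re-run the apply:

# Enable the Cloud DNS API in the freshly created project and make the zone
# wait on it. terraform waits for the enable call to return, but Google may
# still need a few minutes to propagate the change everywhere.
resource "google_project_service" "dns" {
  project = "${google_project.gitlab.project_id}"
  service = "dns.googleapis.com"
}

resource "google_dns_managed_zone" "gitlab_zone" {
  project    = "${google_project.gitlab.project_id}"
  name       = "gitlab-zone"
  dns_name   = "gitlab.example.com."            # placeholder domain
  depends_on = ["google_project_service.dns"]
}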

Red Flag #3 and #4, terraform syntax

As mentioned, I had a terraform.tfvars file locally that was not in my repo. Through my IDE I was asked to click on each file in my project and when I got to the tfvars file I was asked, “What is that file? It isn’t terraform.”

“This is my variables file; it contains my project billing, database secrets, and other items I don’t want to check in, and it is loaded automatically,” I responded, and cat’d out my .gitignore to show it was excluded.

“How are you loading that? Can you show me in the code where you load that file by name?” the interviewer asked.

Red Flag #3 comes in the form of having to explain several times that terraform.tfvars is automatically loaded by terraform to override variables. I cannot show in the code where it is loaded by name; it's automatic. I am at a complete loss as to what the interviewer was asking or whether I managed to answer his question.
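For anyone unfamiliar with the mechanism: variables are declared in the terraform files that are checked in, and terraform automatically reads terraform.tfvars (and any *.auto.tfvars files) from the working directory to fill them in, with no flag and no reference by name anywhere in the code. A minimal illustration with made-up values:

# variables.tf -- declarations, checked into the repository
variable "billing_account" {}
variable "db_password" {}

# terraform.tfvars -- values, kept out of the repository via .gitignore;
# terraform loads this file automatically, no -var-file flag required
billing_account = "XXXXXX-XXXXXX-XXXXXX"
db_password     = "supersecuregitlabpassword"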

While the project was deploying we had some other conversations. The topic of "Are you familiar with AWS and terraform?" came up. I am, and I mentioned that I felt like more AWS resources used nested blocks, which is a bit of an annoyance since you can't iterate over nested blocks.

The interviewer stopped me abruptly. “Yes you can”, he said. I opened an Atom window to type out exactly what I meant.

resource "fakeresource" "myfakeresource" {
config_block {
...
}
config_block {
...
}
}

Someone who has, for example, created firewall rules in GCP would be familiar with the pattern. Google Compute Security Policies also extensively use these nested blocks.

“You can’t iterate over these blocks”, I said. “You can,” the interviewer insisted. I explained that terraform’s count only works on resources. He argued you could “pass an array” (I have no clue what this was supposed to mean, and it is not a thing). In the case of Google Compute Security Policies, that array can only contain 5 members; if you have ten rules, you have to repeat the block. “No, you don’t,” he said.

I don’t mean to call someone out, but the interviewer is completely wrong. There is no way to iterate over nested blocks like these in terraform 0.11. The feature is only now being implemented for terraform 0.12.
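For reference, the 0.12 feature in question is the dynamic block. A sketch of what it will look like once 0.12 lands, using the security policy example from our conversation (the shape of the variable is my own illustration):

# terraform 0.12 syntax (not available in 0.11): generate the repeated nested
# "rule" blocks from a list variable instead of hand-writing each one.
resource "google_compute_security_policy" "policy" {
  name = "example-policy"

  dynamic "rule" {
    for_each = var.rules    # e.g. a list of objects with action, priority, src_ip_ranges
    content {
      action   = rule.value.action
      priority = rule.value.priority

      match {
        versioned_expr = "SRC_IPS_V1"
        config {
          src_ip_ranges = rule.value.src_ip_ranges
        }
      }
    }
  }
}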

I did my best to not be rude or standoff-ish, but the interviewer insisted this was entirely possible. Politely, I changed subjects to get over Red Flag #4.

Red Flag #5, #6, #7, certificates

The terraform apply completing gave me an opportunity to escape from the nested block conversation above. With the GCP Kubernetes UI in the background, I mentioned we would have to wait for the certmanager processes to get us some certificates. This is completely expected and a normal part of the installation.

“Just visit it now”, the interviewer demanded. Since HSTS was in play, I could not visit the page without a certificate error that cannot be bypassed in Chrome. I tried explaining this. “Just push proceed” was the response. The HSTS error page has no proceed button when it is displayed, and that is exactly what appeared because we did not wait for the certificates to finish. There is no proceed. Once again, this is completely expected and a normal part of the installation.

“You can proceed. Just proceed” was the persistent instruction. I insisted we wait while staring at a window with no proceed button and an HSTS error. It wouldn’t take long for the certmanager process to complete. If he really wanted me to, I could probably get around the certificate-related HSTS restriction with Firefox, since I had not visited the HSTS-enabled site with that browser. The interviewers (multiple, at this point) argued Chrome would work fine. It doesn’t. They could see my screen on the screen share and see there was no proceed dialog. This was red flag #5: it seemed pretty condescending to suggest I was not using the browser correctly when the goal was to bypass the security mechanisms of HSTS, GitLab, nginx, etc., while refusing to acknowledge HSTS at all.

Chrome included the message:

You cannot visit <site> right now because the website uses HSTS. Network errors and attacks are usually temporary, so this page will probably work later.

“If you can’t do it that way then visit it by IP”, Daniel Peric suggested. I explained that nothing in nginx matched a bare IP and we would end up at nginx’s default backend. “That should work”, Daniel said. That won’t work. That can’t work. There is no Ingress that matches the IP. There is no nginx host that matches only the IP. Red Flag #6: you will end up at the default backend of the nginx ingress controller, and that’s exactly what happened.

“Open it by name on port 80” the interviewer then stated. At this point, I sort of begged that we just wait for the certificates to finish — it wouldn’t take long. I explained that we aren’t going to be able to visit it on port 80 because HSTS was enabled and I had already visited the site (at the beginning of the tech demo) via HTTPS. “That’s weird. There’s something wrong with your installation. That should work.”

The interviewer was wrong. Again.

Red Flag #7 appears. What the interviewer suggests is a problem with my installation is a feature of the GitLab installation. Visiting by any combination of plain IP, or port 80, or IP:80, or hostname:80… That won’t work. That won’t ever work. That can’t work. Just a few minutes of patience from a Gitlab interviewer to let GitLab get GitLab’s certificates as designed by GitLab Engineers as part of the GitLab Cloud Native Helm Chart would allow us to visit it normally.

During all of this dismissal of my “broken” installation, another interviewer chimed in with “Hey, it loads now”.

Yes. Yes, it does. :)

Strangely, digging into the success of my installation went no further. The interviewers did not ask for accounts to log in with or ask me to visit any status page. For all they knew, ssh cloning was broken or pages beyond the login screen didn’t even work.

All of this, of course, does work in my installation. But why not verify?

I took the opportunity to ask the team whether anything I had done was unsuccessful or seemed strange. “Most people use Ansible”.

Feedback

I didn’t actually receive any feedback. I got a generic rejection email.

I politely emailed the recruiter I had been working with asking for feedback. I received no answer. I emailed the hiring manager and the recruitment resource. I received no answer.

I was polite and requested that I receive feedback around the tech demo. Four days later, I received an email reply: “The opportunity seems to be around the installation of GitLab and providing being able to talk through questions with the hiring team”.

The GitLab Handbook outlines a whole feedback process, along with templates for responding to requests for feedback. The lack of transparency in a decision that appears completely arbitrary is only more painful when you discover that the handbook procedures provide no closure as to how a perfectly working install is a failure.

The team didn’t even log in to the instance. For all I know, I failed because GitLab’s own certmanager process simply exceeded the patience of the interviewers.

I was floored. I’m still floored. I still haven’t received any feedback beyond the team deciding that I failed to deploy GitLab (I didn’t) or that I failed to answer their questions appropriately. We didn’t have time for many questions; we were too busy being distracted by how terraform and TLS work.

I understand the role interacts with customers that may have the same opinions and concerns, but this was a tech demo!

Conclusion

I have outstanding respect for the work that everyone at GitLab does. I have the utmost confidence that my interviewers are great technologists. It seems, however, that they are not at all familiar with Kubernetes and terraform, which they demonstrated very clearly.

Unfortunately, it is extraordinarily painful to think I was dismissed from a position after a complete, working demonstration of an installation, littered not with questions about the installation but with challenges from inexperienced team members about how terraform, TLS, and Google APIs work. I suspect, if questioned, the answer will be that I misunderstood, but there isn't much wiggle room in arguing over features that aren't present in terraform 0.11 or insisting that Chrome should ignore HSTS because someone is impatient.

I think it was unfair that the hiring manager was not present.

To the next hopeful individual applying at GitLab, don’t forget the resounding wisdom in this article: Just use Ansible! :)

Feedback from GitLab!

A week after my request for feedback went unacknowledged, a GitLabber reached out to me by telephone to relay the interviewing team's feedback. Interestingly, during that week, a GitLab Recruiting Manager told me there was a problem with the tech demo, to which the Hiring Manager quickly replied that there was no problem; it was simply a competitive position.

There was feedback provided by the team that was negative. The hiring manager lied. So much for the intent of the GitLab Handbook….

Since I didn’t receive it in writing I will do my best to transcribe it.

The GitLab Team Doesn’t Understand Infrastructure as Code (IaC) Discipline

One part of the feedback provided was that “The goals of the tech demo were misunderstood — there was already an existing instance of GitLab installed”.

IaC, summarized, is our ability to plan and implement infrastructure with the same disciplines developers apply to application code. We can use a tool like terraform to describe the expected state of any infrastructure, deploy it, test it, iterate upon it, update it, and redeploy it "from scratch" as desired. Operations should be reproducible and idempotent.

I brought a pre-existing GitLab installation to the tech demo to allow for questions about HA, DR, Cloud SQL, and other topics about how GitLab would be installed. I proved that I was able to destroy it, iterate on it as necessary, and redeploy it.

I can’t help but feel like I brought professional expectations that exceeded the goals of the exercise. It is unimaginable that an Implementation Engineer at a company like GitLab would be confused by this. It is unimaginable that an implementation team would not have these explicit goals in mind when delivering to customers.

The GitLab Team Didn’t Understand the Question They Asked

From the recollections above, the majority of the questions were exactly as written, and I don't remember a single question that felt valid ("Show me in the code where you load that file [terraform.tfvars]" is not valid). I would gladly take the time to explain to a customer how "HSTS, or HTTP Strict Transport Security, prevents man-in-the-middle attacks between the browser and server by explicitly preventing downgrades from HTTPS to unprotected HTTP". With a customer, we would have the opportunity to table the issue or do additional research if there were unanswered questions or concerns. We could research it together. An Implementation Engineer at GitLab should be intimate with this subject, especially one who has ever used the nginx ingress controller in Kubernetes, which is included with the GitLab Cloud Native Helm Chart.

For the interviewer to declare the installation broken because of HSTS leaves no room to infer anything other than the interviewer's complete unfamiliarity with HSTS. Again, this is not something I would ever have been able to imagine.

A tech demo is not an appropriate place for an impromptu lesson on LetsEncrypt, ACME challenges, and HSTS, but had I known that was what was necessary, I would have been completely willing to teach GitLab Engineers what HSTS is and how it relates to the installation of GitLab, given the opportunity. It is critical to the operation of GitLab and should be very clearly understood by anyone installing it for customers.

I was not asked questions about the Service Networking API or how it benefited the installation. I was not asked about High Availability. I was not asked why I had chosen things like a Cloud SQL instance and/or Kubernetes over VMs with PostgreSQL and Omnibus. I was not asked to explain HSTS.

The GitLab Team Lied About Questions I Asked

This one is very surprising.

The questions I did ask were about whether installing from the Cloud Native Helm Chart was uncommon. I asked what other tech demos looked like. I asked if what I had done made sense. “Not very common”, “Yeah”, “Sure”, “Uh-huh”, “Ansible” were the answers. There was no conversation offered by the interviewers about the tech demo or the overall presentation. With the interviewers unwilling to discuss the actual tech demo… I kind of assumed we were done.

According to the listing for this position I was supposed to have another interview that wasn’t a tech demo (if I remember correctly, the tech demo was actually a “may” and not a “will”). I assumed that would be a more appropriate place for questions about GitLab values, how “fun” it is to work there, what contributions that person(s) had made to the Handbook, and other topics.

Updated Conclusion

I can’t help but assume that I was interviewed by a less experienced team with the hiring manager not present. You will not convince me that GitLab Professional Services do not understand HSTS, terraform, or Infrastructure-as-Code discipline.

My interviewers struggled with these concepts to the point of leaving feedback specifically antithetical to a successful GitLab installation. You will not convince me I somehow didn't draw the luck of the B-team, because this is not the standard of GitLab (well, update: we learned it was so Daniel Peric could hire Nicki Peric).

My original advice still stands, with a slight extension, for anyone with aspirations for the position I applied for: know your audience. Just use Ansible to stick GitLab Omnibus on one single VM with a single local PostgreSQL database and no TLS… and ask if everyone enjoyed their morning surfing. This is a wildly inappropriate installation for any business and painfully counter-intuitive to any professional expectations, but I believe this is what would have been a successful tech demo in the eyes of the team that interviewed me.

Follow the link below for an installation using an external Cloud SQL database with a private IP that you can make Highly Available. This is a far more reasonable starting point for any GitLab installation worthy of delivering to a customer, but make sure you scale up the database node(s) and turn off the preemptible node setting!

https://gitlab.com/e_snyder/gitlab-setup

I’m really glad I used a micro instance and preemptible nodes… the bill is going to be much cheaper for this lesson in over-achieving.
