Skip to main content

8 posts tagged with "productivity"

View All Tags

· 3 min read
Alvaro Jose

When we talk about observability, we talk about:

Capability of developers to understand the health and status of their application.

We don't want users or clients to be the ones noticing something is wrong. For this, there are multiple tools that fall under the observability category.

Tools

Alarms

This is the first line of defense against issues, the intent is to get notified if any potential issue arises.
The intent of this is to provide a notification if any parameter of our application is out of range (ex. to many 5xx).

This allows us to use our mental bandwidth to focus in creating value and not continuously check if the parameters are in range.

This affect the next DORA 4 metrics:

  • ✔️ Mean Time To Recovery

Metrics

As the name says, this is a set of measurements we track from our code, it allows us to understand the health of individual parts of our system.

This metrics are shown in dashboards that allow us to visually understand what is happening. We can divide metrics dashboards in 2 types:

  • Status: It will give us a really fast overview of the health of the system.
  • Details: It will not tell us what is wrong, but will provide more detailed information to dig deeper into a specific area.

It's important to not mix this 2 together, as they have different purposes. Like with alarms, it helps focus our mental bandwidth in the correct place.

As you see in the previous image, the left represents a detail dashboard that makes it difficult to know on a single view if there is an issue. For this, as in the image on the right, we have a status dashboard that in a single glance we can spot where to look next.

This affect the next DORA 4 metrics:

  • ✔️ Mean Time To Recovery

Logs

This is the lower level you want to go. It should tell you where in the code is your issue, so you can go and fix it.

When thinking about logging, it is significant not log everything. Due to the added noise that this can bring.

This affect the next DORA 4 metrics:

  • ✔️ Mean Time To Recovery

Example

let's get practical on how would this work.

  • Implement your service
  • Create metrics and send them to your metrics system (ex. Datadog, Grafana)
  • Create logs and send them to your logging system (ex. Datadog, Kibana, CloudWatch).
  • Create dashboards:
    • Single Status dashboard. Use only simple boxes with green and red backgrounds that represent in one view the health of your system & subsystems.
    • Multiple Detail dashboards. Create a dashboard for each subsystem with as much data as necessary to understand where the issue is, so you can later pinpoint the root cause in your logs.
  • Create alarms based on the status dashboard boxes.
  • Connect your notification system (ex. Opsgenie, PagerDuty, Slack channel) to the created alarms, so you get push notifications as something goes wrong.

· 4 min read
Alvaro Jose

When we start our journey towards continuous integration & delivery, the first thing to take in count is the mentality. There are a few of them that will make or break our intent. Let's see the most important and also some practices.

Mentality

You build it, you run it

create a DevOps culture, not a Devs vs Ops

This mentality is the idea that the same people who develop the software re in charge to maintain it in good health by observing it.

For many years, this was not the case. Operations & development were handled by different teams. This caused a dystopian situation where each group had a different goal:

  • Devs: deliver as fast as possible. By pushing code to production without observing the side effects of it.
  • Ops: keep system stability.

With the 'you build it, you run it' mentality, devs focus on their service or work, while Ops becomes a product team that focus on providing the correct tooling for Developers.

This affect the next DORA 4 metrics:

  • ✔️ Deployment frequency
  • ✔️ Lead Time for change
  • ✔️ Mean Time To Recovery
  • ✔️ Change Failure Rate

Embrace Ownership in Failure Culture

the problem is not breaking things, is the inability to recover from it

Normally, developers feel they need a safety net to feel comfortable to introduce changes to production, this tends to translate in delegating the ownership to others trough peer review or other validation step.
This lack of ownership have massive effects on the capacity to recover and the gates that code needs to go through, affecting the feedback cycle.

To improve this failure culture is necessary to promote this behavior, having no blame reduces the amount of stress people go through.

If something fails is not an issue of the individual but of the process itself.

Imagine that every commit goes to production, changes will be so small that fixing or rolling back can be done in minutes or seconds. At the same time, developers will be able to create the correct tooling to feel more comfortable with this practice.

This affect the next DORA 4 metrics:

  • ✔️ Deployment frequency
  • ✔️ Lead Time for change
  • ✔️ Mean Time To Recovery
  • ✔️ Change Failure Rate

Be a Boy Scout

Don’t continue the same path if you think something can be done better

As individuals, need to bring change to our products. If we see any new practice, tool, services… that can support the work of the team, bring it forward. Don't shy away because the team is currently doing it.

This affect the next DORA 4 metrics:

  • ✔️ Deployment frequency
  • ✔️ Lead Time for change
  • ✔️ Mean Time To Recovery
  • ✔️ Change Failure Rate

Learn & Adapt

Not everything is solved in the same way, don't follow:

If your only tool is a hammer then every problem looks like a nail

For this, learn and take your time for it. When you have a new problem, as it's possible, you don't have the correct tool in your toolbox.

This affect the next DORA 4 metrics:

  • ✔️ Deployment frequency
  • ✔️ Lead Time for change
  • ✔️ Mean Time To Recovery
  • ✔️ Change Failure Rate

Practices

Firefighter Role

The firefighter role is a rotating role inside the team. They are responsible for being the first responder to incidents and helping solve them.
At the same time, to make sure this person does not suffer from cognitive load due to context switching, this person is not involved on the normal pair rotation and development tasks.
In exchange, they focus during the week in improving the specific tooling of the project (ex. DB migration tooling).

This affect the next DORA 4 metrics:

  • ✔️ Deployment frequency
  • ✔️ Lead Time for change
  • ✔️ Mean Time To Recovery
  • ✔️ Change Failure Rate

On Call Rotation

As the development team is also in charge of running the service, some of them will require after working hour support. On call is just this, the disposition of team members to take care of their services around the clock.
This tends to sound bad, but there are ways to not make this suck. I can't express it better than Chris Ford has already done in this page.

This affect the next DORA 4 metric:

  • ✔️ Mean Time To Recovery

Conclusion

These are the starting point to feel comfortable running things in production without the concern that any issue is a catastrophic thing. Failing is not an issue, the important part is to be able to recover as soon as possible from any problem that arises.

· 3 min read
Alvaro Jose

This is a series I am really looking forward to writing. I have been doing this presentation for the last 3 years in multiple places.

Am I Crazy?

The answer is no, most of the thing you will see on this series comes from practices derived from Extreme Programming, that show how to build quality and value into products. So bear with me for some time.

Motivation

A few years ago, I read the book Accelerate that is derived of the analysis of the state of DevOps report that happens in a regular basis.

The book does not speak only about technology but also speaks about communication, organization, etc. And how this affects effectiveness in teams & companies. I recommend reading the entire book.

4 key metrics

Nevertheless, most of the people resume this book (erroneously) in the next table.

It does a comparison on a what are called the 4 key metrics, and provide a classification of performance (teams & companies, since 2017 this classification has evolved).

What does these 4 key metrics mean:

  • Deployment frequency: is how often does the team deploy to production.
  • Lead Time for change: is how much time does a story take to get to production.
  • Mean Time To Recovery: is how fast can we solve a production issues.
  • Change Failure Rate: is how frequently do we break things in production.

All this metrics is helping teams understand their feedback cycle and stability. In the case of the team, I currently work with:

  • Deployment Frequency: once per commit to trunk (while doing trunk-based development) what ends up translating to a few times per day.
  • Lead Time for change: below 1h. We can activate a feature as soon as the code is deployed by the CI/CD using feature flags.
  • Mean Time To Recovery: In minutes. We can activate and deactivate feature flags on the fly if any of the code breaks, and we have a good observability and alarming, so we are the first one to notice.
  • Change Failure Rate: We don't optimize for this, as MTTR is more important for us (I will explain why later). Nevertheless, we currently only had 2 minor production issues in the last year, so we are way below 1%. Our CI/CD validations help a lot on this.

The intent of this series is to share the Extreme programming practices that we use to achieve being on the elite classification of DORA 4.

Note of Caution

As this twitter thread shows, this is not one size fits all, the challenges of a team are not the challenges of another one. There is no silver bullet or common root cause to the issue, and each team should use this metrics to track improvements in an unbiased way. For this, the 4 key metrics do not mean anything at company level and should not be used to compare teams.

Next

In the following installments, I will walk backwards from having something in production and how to keep it running in a healthy manner stress-free up to coding techniques that enable Trunk-based development.

· 4 min read
Alvaro Jose

In software development, over the last years we always talk about on cross-functional teams, as a good split of responsibilities to provide autonomy in teams. What does that mean? Why is this so? And what does it look like?

History & types of teams

It's probably easier to see the evolution of team culture as a chronology, as it has been an evolving thing.

Specialization-Based Teams

Traditionally, when we had only big monolithic applications, teams have been split by their expertise. This meaning all the quality assurance, Frontend, Backend roles will be in a team with their expertise-based peers. This might look like the next image:

What are the pros and cons of this model:

  • ✔️ Improve depth of knowledge from peers.
  • ✔️ No dependency on individuals, the Bus factor tends to be bigger than 1.
  • ❌ Bottlenecks in between teams, due to different priorities and timelines.
  • ❌ Lack of breath of knowledge.
  • ❌ Low domain expertise due to coverage of all domains.
  • ❌ Continuous context switch due to support of multiple domains.
  • ❌ Design issues due Conway's Law relation in between communication patterns and architecture.
  • ❌ Eventually, teams grow too big and have management issues due to Dunbar's Number on human relationships.

Specialized Cross-functional Teams

Due to the shortcomings of the previous model and the raise of microservices and some concepts from DDD, the intention of splitting teams was to make sure a specific domain and their solutions were cover by the same people.
This allows more independence and control over what is required to fulfill the needs of that domain.

This might look like the next image:

What are the pros and cons of this model:

  • ✔️ Common domain expertise, allowing faster and informed decisions.
  • ✔️ Single domain will not require a lot of context switch.
  • ✔️ Helps design on microservices environments due to Conway's Law.
  • ✔️ Teams tend to stay small and follow Dunbar's Number on human relationships (ex. Amazon 2 large pizza team size).
  • ❌ Bottlenecks in between team members, due to process dependency.
  • ❌ Lack of depth of knowledge from peers.
  • ❌ Lack of breath of knowledge being shared.
  • ❌ Bus factor tends to be small.

T-shaped Cross-Functional Teams

The previous organization helped many teams to be able to focus and do the right thing in the right moment.

Nevertheless, it lacked the focus on collaboration and support inside the team, as each person has their small set of responsibilities can easily cause bottlenecks inside a single team.

T-shaped development tries to solve this by making sure all team members can work in every part of the solution (represented by the horizontal part of the 'T'). Nevertheless, each member can have his own preferred field of expertise (represented by the vertical part of the 'T').
This has been enabled by the lower complexity on the tooling and entry-level learning curve to most of the expertises.

What are the pros and cons of this model:

  • ✔️ No bottlenecks as all team members can chip in to the different needs.
  • ✔️ Common domain expertise, allowing faster and informed decisions.
  • ✔️ Single domain will not require a lot of context switch.
  • ✔️ Helps design on microservices environments due to Conway's Law.
  • ✔️ Teams tend to stay small and follow Dunbar's Number on human relationships (ex. Amazon 2 large pizza team size).
  • ✔️ Shared tasks improve a team member depth of knowledge.
  • ✔️ Shared tasks improve a team member breath of knowledge.
  • ✔️ As knowledge is spread inside the team, the Bus Factor is not an issue.

Conclusion

Time has improved things for all teams, and we are probably not at the end of the transformation of teams. Nevertheless, it is good for companies and individuals to adapt to changes in the environment.

· One min read
Alvaro Jose

Video

Long Version

I am currently starting some new open-source projects and I feel it is tedious to do some recurrent tasks. For example:

  • Promote this blog post in social media
  • Announce a new release.

Power Automate & IFTTT integrations allow just this, by a process of action and reaction.

These systems provide:

  • Triggers: they are the starting point of what will happen after.
  • Actions: they react to previous steps on the described flow.

An example of this is the next flow:

image

image

  • In IFTTT, if a new feed item exists in the RSS of my blog, it's posted as a card in a Trello board.
  • The Power automate flow looks for new cards on that column.
  • Retrieve the content
  • post it into medium
  • Post into Twitter and LinkedIn about the new blog post.

As you can see, automation is cool and can save us a lot of effort by increasing our productivity.

· 5 min read
Alvaro Jose

Why are messages important?

Commit messages are part of the collaboration we do day to day inside a team, it works as a record of what has happened.

Every time you perform a commit, you’re recording a snapshot of your project that you can revert to or compare to later.

— Pro Git Book

Commit messages are used in many ways, including:

  • To help a future reader quickly understand what changed and why it changed
  • To assist with easily undoing specific changes
  • To prepare change notes or bump versions for a release

All three of these use cases require a clean and consistent commit message style.

Easy Commit messages with Commitizen

This tool purpose is to define a standard way of committing rules and communicating it. The reasoning behind it is that it is easier to read, and enforces writing descriptive commits. Removing the ambiguity of options and the mental load of following the standard manually.

Commitizen will prompt you a series of questions that will generate the final commit message. It has multiple adapters, in my case I prefer to be controlling the questions, so I use cz-format-extension.

You can add commitizen to your project with the next command line

npm install commitizen --save-dev # npm
yarn add commitizen -D # Yarn

Add any of the available adapters, in my case cz-format-extension:

    npm install cz-format-extension --save-dev # npm
yarn add cz-format-extension -D # Yarn

In your package.json you will need to add the next section:

  ...
"config": {
...
"commitizen": {
"path": "cz-format-extension"
}
}
...

The Adapter cz-format-extension allows a massive flexibility as the questions can be defined in a .czfrec.js file. An example is:

const { contributors } = require('./package.json')

module.exports = {
questions({inquirer}) {
return [
{
type: "list",
name: "type",
message: "'What is the type of this change:",
choices: [
{
type: "list",
name: "type",
message: "'What is the type of this change:",
choices: [
{
"name": "feat: A new feature",
"value": "feat"
},
{
"name": "fix: A bug fix",
"value": "fix"
},
{
"name": "docs: Documentation only changes",
"value": "docs"
},
...
]
},
{
type: 'list',
name: 'scope',
message: 'What is the scope of this change:',
choices: [
{
"name": "core: base system of the application",
"value": "core"
},
{
"name": "extensions: systems that are observed",
"value": "extensions"
},
{
"name": "tools: other things in the project",
"value": "tools"
},
]
},
{
type: 'input',
name: 'message',
message: "Write a short, imperative tense description of the change\n",
validate: (message) => message.length === 0 ? 'message is required' : true
},
{
type: 'input',
name: 'body',
message: 'Provide a longer description of the change: (press enter to skip)\n',
},
{
type: 'confirm',
name: 'isBreaking',
message: 'Are there any breaking changes?',
default: false
},
{
type: 'input',
name: 'breaking',
message: 'Describe the breaking changes:\n',
when: answers => answers.isBreaking
},
{
type: 'confirm',
name: 'isIssueAffected',
message: 'Does this change affect any open issues?',
default: false
},
{
type: 'input',
name: 'issues',
message: 'Add issue references:\n',
when: answers => answers.isIssueAffected,
default: undefined,
validate: (issues) => issues.length === 0 ? 'issues is required' : true
},
{
type: 'checkbox',
name: 'coauthors',
message: 'Select Co-Authors if any:',
choices: contributors.map(contributor => ({
name: contributor.name,
value: `Co-authored-by: ${contributor.name} <${contributor.email}>`,
}))
},
]
},
commitMessage({answers}) {
const scope = answers.scope ? `(${answers.scope})` : '';
const head = `${answers.type}${scope}: ${answers.message}`;
const body = answers.body ? answers.body : '';
const breaking = answers.breaking ? `BREAKING CHANGE: ${answers.breaking}` : '';
const issues = answers.issues ? answers.issues : '';
const coauthors = answers.coauthors.join('\n');

return [head, body, breaking, issues, coauthors].join('\n\n').trim()
}
}

The file creates a process of questions for:

  • type: align with semantic release message specification
  • scope: affected part of the application
  • message: the imperative written message
  • body: longer description
  • breaking: to determine if it's a breaking change for semantic release
  • Issue: related issue of our ticketing system
  • Co-Authors: people working in the tasks while pair programming

All these questions are asked interactively and not by the brain power of doing manual work.

And you can then add some nice npm scripts in your package.json file pointing to the local version of Commitizen:

  ...
"scripts": {
"commit": "cz"
}

This will be more convenient for your users because then if they want to do a commit, all they need to do is run npm run commit and they will get the prompts needed to start a commit!

NOTE: If you are using precommit hooks thanks to something like husky, you will need to name your script something other than "commit" (e.g. "cm": "cz"). The reason is because npm scripts has a "feature" where it automatically runs scripts with the name prexxx where xxx is the name of another script. In essence, npm and husky will run "precommit" scripts twice if you name the script "commit", and the workaround is to prevent the npm-triggered precommit script.

That is all :) . I will do a special mention to commitlint that is a very useful tool to lint commit messages. I do not use it anymore as it has some overlap with commitizen.

· 3 min read
Alvaro Jose

What & Why Git hooks?

Git hooks are scripts that Git executes locally before or after events such as commit, push, and receive.

These hooks are completely programmable trough bash scripting. Examples of what can be done:

  • pre-commit: Enforce project coding standards.
  • pre-push: Run tests.

This allows us to make sure we are committing the correct things at the correct time. Not breaking our code just because of the mental load of doing things as a manual process that can be forgotten.

How to Start

Add Husky

Husky is a tool that allows Git hooks using JavaScript configured using individual files for hooks in a .husky/ directory.

The fastest way to install husky is by using husky-init, a one-time command to quickly initialize a project with husky:

npx husky-init && npm install       # npm
npx husky-init && yarn # Yarn 1
yarn dlx husky-init --yarn2 && yarn # Yarn 2+
pnpm dlx husky-init && pnpm install # pnpm

It will set up husky, modify package.json and create a sample pre-commit hook that you can edit. By default, it will run test when you commit.

To add another hook, use husky add.

If you are not comfortable using husky-init you can find other options here.

Add lint-staged

Husky is very useful, but it will run natively to git and not focus the commands in our bash scripts for all the files, not only the ones we want to commit.

Lint Staged appear to resolve this problem. It allows you to run the process against staged git files that match a pattern.

asciicast

Install lint-staged by adding it to your local project.

npm install lint-staged --save-dev
yarn add lint-staged -D

In your package.json add it as a script("lint-staged": "lint-staged",) and refer it through a pre-commit hook. If using Husky, this can be found in .husky/pre-commit with the next content:

#!/bin/sh
. "$(dirname "$0")/_/husky.sh"

yarn lint-staged

There are multiple ways to configure lint-staged. One of them is having a lint-staged.config.js file in your project root folder. In this file, you can express what process you want to run for what types of files. For example:

module.exports = {
'*.{ts,tsx}': [() => 'yarn tsc:check', 'yarn format', 'yarn lint:fix', 'yarn test', 'git add .'],
};

The previous snipped runs the compiler check, formatting, linting and test before adding the fixed staged files to the current commit.

Conclusion

With this two tools, we will now be pushing code that will pass similar checks than our CI/CD system.

· 4 min read
Alvaro Jose

Over the last few years, some practices appear to be more a dogma than a value adding practice. One of this is Pull Requests.

Why PR's exist

  • Malicious Code Prevention: Pull requests exist mostly as a practice accepted for zero trust environments (ex. Open Source). An attack vector on this type of environment is the ability of anyone to contribute, meaning you could inject code that could create known vulnerabilities that packages will inherit. That is why maintainers validate code from unknown users.

Malicious actors

  • Highly Distributed Teams: PR's can be use for highly distributed teams (around the clock) as a way to do knowledge sharing. If someone in side A of the world can follow and understand the changes for a feature that is being developed in side B of the world.

Distributed Teams

The issue

IS there any value of doing PRs when people work collocated? What is the cost of PRs in trust environments?

The value that normally people give to PRs is the one of having a peer review process. Nevertheless, we will see later in this article that there are less invasive ways to do this.

Some costs of PRs are:

  • Slow Delivery: PRs are a start and stop strategy where there is a gateway to merge code. This is time that needs to be taken by developers (writting & preparing a PR) and reviewers (reviewing, commenting, etc) to go through the process. At the same time is more time the code takes to get to production (merging, re-testing, etc). This is significant for features but also for fixes, meaning you can go from a response time of minutes to hours.
  • Isolation work: When working on branches, devs work on code that works isolated but needs to be merged with a continuous stream of changes. This means that any test isolated will probably be invalidated upon merging.
  • Lack of ownership: As work is done isolated, the developer who creates a PR delegates part or the responsibility to the reviewer. Humans don't have compilers or containers to run the code in our brain, meaning catching errors tends to be out of our realm.
  • Egos: As catching errors tends to be out of our human realm, PRs tend to become a thing related to preferences (Style, patterns, etc). This hardly provides any value to the code as either tools like linters can do this automatically or it brings premature optimizations.
  • Late feedback: Any valid recommendation is actually provided quite late in the process, when the code has already been written and validated. Causing rework that takes time.

The Alternatives

Pull requests are just one of the asynchronous peer code reviews styles. Is not the only way of doing peer reviews.

Some other types of peer reviews that I, personally, like are:

  • Over-the-shoulder: The bases of this is to have a conversation over what has been or is being implemented. This creates a synchronous feedback loop on an async process. It does not fix all the shortcomings of a PR, but it creates a faster feedback loop.
  • Pair Programming / Mob Programming: The idea is that multiple developers work in the same code base in the same computer, creating a synchronous feedback loop in a synchronous process. This with Trunk-based development allows very fast feedback loops at product level, and with the correct tools generates resilience and ownership among developers.

The disclaimer here is I have worked doing pair programming, TDD and trunk-based development for more than 5 years in multiple size companies, and it has always been a bliss.