Web Development - Self-evaluation

This form is intended to be used as part of mentorship, coaching and training to allow a learning web developer to get an overview of what they know, allow a mentor to know what to expect and generally provide a useful list of skills, concepts and terminology to familiarize oneself with. This is slanted towards open source tooling and the Unix/Linux tradition. So don't expect much Windows and .Net or enterprise-oriented content. It will also give some information and guidance on any topic you rate yourself low on. No information is saved or distributed. You can save your result link to revisit it and update your evaluation. I hope you find it useful.

You can change your responses here as you learn or re-evaluate.

Web Fundamentals

Starting out with some basic and very core parts of working with web development.

HTML

Hypertext Markup Language is the way we show, present and format information and user interfaces for the web. At its core it lets us create documents. It is the original and oldest building block that to this day makes up the modern browsing experience. It has picked up a lot of features along the way and at some point we got HTML5. This was turned into a Living Standard and the standards process has been working with that since, rather than releasing specific versions. HTML is based on tags that start with <elementname> and usually end with </elementname>. A good understanding of HTML is important for web development. For systems development and API work it often isn't critical. It is always useful though. MDN has an Introduction to HTML that looks promising.
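To make the start tag and end tag idea concrete, here is a minimal sketch in Python using the standard library's html.parser module. The TagLogger class and the sample markup are made up for illustration; the sketch only records what the parser sees and is not how a browser works internally.

```python
from html.parser import HTMLParser


class TagLogger(HTMLParser):
    """Hypothetical helper that records start tags, end tags and text."""

    def __init__(self):
        super().__init__()
        self.events = []

    def handle_starttag(self, tag, attrs):
        self.events.append(("start", tag))

    def handle_endtag(self, tag):
        self.events.append(("end", tag))

    def handle_data(self, data):
        # Skip chunks that are only whitespace.
        if data.strip():
            self.events.append(("text", data.strip()))


parser = TagLogger()
parser.feed("<p>Hello <em>web</em>!</p>")
parser.close()

for event in parser.events:
    print(event)
```

Each element shows up as a start event, its content, and a matching end event, which is the nesting that the Document Object Model is built from.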

CSS

If HTML is all about documents, CSS is all about style and presentation. Before CSS all presentation was intermingled with the document, using elements for changing the font and attributes for colors and such. CSS stands for Cascading Style Sheets and is a very flexible and sometimes a bit perplexing solution to providing style information for a document.

By shifting the presentation aspects to CSS it became a lot easier to make visually interesting designs. It also means HTML became easier to manage for non-visual parsers such as screen-readers for the visually impaired or for automated tools. The markup could be simpler as it didn't need to provide the visuals. It also introduced a healthy separation between how we design a document structure and how we make it look good. Though this separation does not hold fully for heavy styling. MDN offers an introduction to CSS.

Javascript (in browser, with HTML)

The language that ate the web. Javascript was quite famously created in 10 days by Brendan Eich. You can hear some of the history in this video. It was created to add more interactivity to the web. And it did. It has of course been grown, evolved and improved since that initial creation. It is worth noting that it has no real relation to Java aside from Java being a very strong brand at the time. It also bears the name ECMAScript from the standard that describes it. It does what it set out to do: it makes the web significantly more interactive.

It is a bit of an odd language and opinions about it vary wildly. But it largely does not matter what you think about it as a language design. It has spread everywhere and it is a web standard that every web developer should know how to wield. With the advent of Node.js the language also made its way into backend development and general systems programming, desktop applications and everything else. The only real alternative to using Javascript in the browser is WebAssembly, which is still quite young. It is not as important to know but might become more common over time.

Of course, MDN delivers a tutorial on Javascript.

Video

Video in the browser mostly became a real thing around Web 2.0 with HTML5 Video. Before that everyone had to go through Flash. Modern web development can use video natively and can focus on making the video element work as we want it to. There are a few formats that work well and standardization has come a long way. Rather than converting your video into 3 or 4 formats you can mostly get away with 1 or 2. It is still important to understand these nuances beyond just knowing that you can put video on the web. It isn't recommended to put unprocessed videos on the web as they usually aren't in an efficient format. And that leads into encoding, maybe transcoding, quality, bitrates. A deep pit of fun, fun, fun. MDN has some information about all of this media stuff.

Browser developer tools

Whether you open them with F12, Ctrl+Alt+K, Cmd+Alt+J or whatever combination works in your browser, most browsers have a set of development tools. Most of them are built on the ideas that were established and polished in the Firefox extension Firebug. Good browser developer tools allow you to: Inspect the network requests at the HTTP layer to see what you are sending and receiving. Explore the Document Object Model created from your HTML. Debug your Javascript. See errors in a Console. Check loading times. And much more.

Get familiar. If you intend to develop for the web you are going to spend a lot of time together.

Web Frontend Technologies

Now we are getting a bit more into the weeds in the browser. Don't worry if some of these are foreign to you. That's fine. This is not a pass/fail test. Some are common, some are less common.

HTML Forms

The original way of making a web page interactive. Still relevant today. Using input elements, textareas and buttons you can allow your visitors to send information to a server if the server is set up to receive it.

It is sent with either GET or POST requests generally. GET will give you a new URL containing your choices as query parameters, such as mypage.html?firstname=Victor&lastname=Hugo. GET is often used for things like searches and filtering where the selections should be remembered and depending on the details of the site it may mean you can share the URL and another visitor can continue from your choices. POST requests are sent with a different kind of HTTP request where the data is not visible in the UI.

The query strings in a GET request are urlencoded so that they can be safely represented in a URL. A POST form also sends data as urlencoded by default. By default the encoding headers will indicate application/x-www-form-urlencoded. If you want to send a file upload you will need to use a special encoding and you do that by using enctype="multipart/form-data".
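As a rough illustration of what urlencoding does to form data, here is a small Python sketch using the standard library's urllib.parse module. The field names reuse the example URL above; the "cats & dogs" search value is made up.

```python
from urllib.parse import urlencode, parse_qs

# Form fields matching the example URL in the text above.
fields = {"firstname": "Victor", "lastname": "Hugo"}

# urlencode turns the fields into a query string that is safe to put in a URL.
query = urlencode(fields)
url = "mypage.html?" + query

# Characters with a special meaning in URLs, such as spaces and "&", get escaped.
tricky = urlencode({"q": "cats & dogs"})

# parse_qs reverses the encoding; every key maps to a list of values.
decoded = parse_qs(tricky)

print(url)
print(tricky)
print(decoded)
```

The same encoding is what a browser puts in the body of a POST request sent as application/x-www-form-urlencoded.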

MDN has further guidance on forms.

Using APIs from Javascript

An API or Application Programming Interface is a method that is provided for interacting with a piece of software from your code. In the web context we usually mean a server application when talking about APIs. But local code can also present you with an API, for example Moment.js provides a set of API functions for managing dates and time in Javascript. It does not require a server.

But if you want to load some data from Twitter you will need to create a developer account on their site, read their documentation about their APIs and then you can start using their APIs with whatever credentials they provided. Usually this means sending HTTP requests with JSON data according to the REST model of API design. More recently GraphQL is a popular alternative for API design. Another common option is WebSockets, which is useful for receiving updates from the server without asking for them constantly. Sometimes it is only POST requests, sometimes the data is formatted as XML or something else. All of these can be APIs.

For more context on local APIs in the web browser, these are APIs that the browser provides for you to be able to interact with the user and web pages. MDN provides a list of browser APIs.

You can use this blog post as a useful guide into using an API from Javascript.

JSON

"JavaScript Object Notation is a lightweight data-interchange format." - json.org

This means that if you need to send a string of data to your server or read a string of data from your server, JSON is a useful way of doing it. I dare say it is the most common way of doing that at the moment.

JSON showed up seemingly out of nowhere as everyone was getting going with Web 2.0 and AJAX (Asynchronous Javascript And XML) and promptly replaced the XML. That was probably for the best. Where XML is quite wordy and a little bit tricky to work with, JSON is a format that slots very well into Javascript. There is some argument whether it makes sense as the most common interchange format for APIs and such. But it is there now and most people use it whether they like to or not.

It can be parsed fairly quickly (though it wasn't created for high performance) and it can be very readable to humans as well as machines. MDN can teach you more about JSON.
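JSON is not tied to Javascript. As a small sketch, this is the round trip from a data structure to a JSON string and back in Python, using the standard json module. The record itself is made up for illustration.

```python
import json

# A hypothetical record, as a server might hold it in memory.
author = {
    "firstname": "Victor",
    "lastname": "Hugo",
    "novels": ["Notre-Dame de Paris"],
    "alive": False,
}

# Serialize: Python dict -> JSON string, ready to send over HTTP.
text = json.dumps(author)

# Parse: JSON string -> Python dict, as the receiving side would do.
restored = json.loads(text)

print(text)
```

Note how Python's False becomes the JSON literal false in the string, and comes back as False after parsing.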

CORS (Cross-Origin Resource Sharing)

Some servers might not want you to ask them questions from your Javascript. CORS is one part of restricting this. It uses HTTP response headers and requires the server to state which origins (domain names) it will allow to make Cross-Origin requests to it. Otherwise the browser will block your requests unless your page is actually served from that server's domain. This can be a common source of confusion and frustration when attempting to load data from other servers. MDN has an in-depth explanation of CORS.
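To show the server side of that agreement, here is a minimal sketch in Python. The allow-list, the origins and the cors_headers function are all hypothetical; real frameworks usually ship their own CORS middleware, and this leaves out preflight (OPTIONS) handling entirely.

```python
# Hypothetical allow-list; a real server would load this from configuration.
ALLOWED_ORIGINS = {"https://app.example.com", "https://admin.example.com"}


def cors_headers(request_origin):
    """Return the CORS response headers for the value of a request's Origin header."""
    if request_origin in ALLOWED_ORIGINS:
        return {
            # Tells the browser that this origin may read the response.
            "Access-Control-Allow-Origin": request_origin,
            # Tells caches that the response differs per requesting origin.
            "Vary": "Origin",
        }
    # No CORS headers: the browser will refuse to hand the response to the page.
    return {}


print(cors_headers("https://app.example.com"))
print(cors_headers("https://somewhere-else.example.net"))
```

The enforcement happens in the browser. The server only states its policy through the headers.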

Node.js

Node is a way of running Javascript outside of the browser as a general-purpose programming language, like Python, PHP, Ruby and many others. One reason why this became very popular is that it offered the option to use the same language for the frontend (browser code) and the backend (web server/API). Some even use it to allow sharing some code between frontend and backend, usually called isomorphic Javascript.

Along with this, the Javascript engine from Chrome, known as V8, that was used for Node.js is quite efficient at what it does. So Node has decent performance for such a dynamic language. And a lot of people are used to Javascript and find the event-driven model for asynchronous calls and the event-loop model used in Node.js easy to work with and reason about. And of course it can help that you get the same set of challenges for the frontend and backend.

The official Node.js site has more information in its documentation and guides.

NPM

Node.js has a package manager known as NPM that you will probably encounter when using Node.js. It allows you to download extensions and additions to your code base. Most applications will end up using third party extensions like this. There are also a ton of alternative package managers.

The Node documentation provides some explanation.

Any frontend Javascript framework (such as React, Vue, Angular, Svelte, Ember, etc.)

A framework in software development is a set of tools and/or principles for building your application. Often they provide a lot of generic and general utilities while you are expected to use those to create the actual business logic and visuals of your application or web site.

Some frameworks are large and provide a lot of things, some are small and provide a little. Some are opinionated and give you direction and restriction by their design and some are loose and free, giving you very little guidance but also very little restriction.

Currently popular frameworks such as React and Vue are often called reactive as they try to abstract away some of the complexities of working with browser state by means of declarative UI that reacts to changes in the underlying data. Other frameworks have had other approaches.

It is very useful to be familiar with at least one of these frameworks and it usually makes picking up new ones easier. They are especially useful for managing the complexity of larger frontend applications, such as SPAs (Single Page Applications).

It is near impossible to recommend one. As of 2020, React is extremely popular and has been for the last few years. Vue has been a continual runner-up with a solid reputation, Angular fell out of favor but still hangs around, Svelte is a newer and very interesting option, Ember is old-school but has been kept up to date and might provide something quite different from the newer generation.

SVG

A standard for vector graphics that can be used on the web either as a linked file, much like a normal JPEG/PNG/GIF, or embedded inline as it is not a binary format. Its primary advantages are visual fidelity (it can always be rendered cleanly at any size or distortion as it does not work with pixels), the possibility to animate it and generally small file sizes. MDN has a thorough reference.

iframes and embedding

Embedding a web page in another web page is done with iframes today. Once, a thing called frames was popular, but that was the 90's. iframes are rarely nice but often the only solution for certain things. They come with a lot of corner-cases and specific restrictions. But at the core they show a piece of another web page inside a block on your web page. More about this "inline frame" on MDN.

Javascript tooling (Webpack, Gulp, Grunt and related)

The Javascript ecosystem has grown immensely since the introduction of NPM and Node.js. And effort has been put into making it possible to use the same code in the browser and the backend. Quite a bit of effort has been put into reducing the total size of code for the browser by minifying. There are transpilers that allow new ECMAScript syntax in the browser before the browser implements it, polyfills that fill in missing features, preprocessors for different flavors of CSS (such as SASS and LESS) and build tools for different templating systems in frontend frameworks. Then there is running tests and development servers. There is a lot of work to be done by JS tooling and these tools try to do most of it. We should all be familiar with the core tools. I find it challenging with Javascript specifically, but I'm old.

I found this rundown of the distinctions useful. This might also be useful.

Javascript test frameworks

Like with any programming language there is a lot to be gained from writing tests for Javascript. A bunch of test frameworks have been created to allow this. Mocha, Jest, Jasmine, Karma and Puppeteer show up as a common set of options. There are more. Most test frameworks have significant similarities, so to start with I would pick one that is fairly popular and has good docs that make it easy to learn. This article provides an overview.

Alternate browser languages (Typescript, Elm, etc)

This might not be something you need to learn but it is useful to be aware of. I would call Typescript a stricter Javascript and I've only heard of, never investigated, Elm. Apparently it represents a really interesting idea of what frontend development could be. What these have in common is that they are separate languages from Javascript that generally compile down to Javascript to run in the browser. Maybe they'll compile to WebAssembly in the future but currently they target Javascript.

HTTP (the protocol)

The web works over a fairly simple technical protocol. It is beneficial to understand it fairly well. It is core to any REST API, AJAX request or web page load. The methods GET/POST/PUT/DELETE should be familiar if you've been in web dev for a while. HEAD/OPTIONS/PATCH and a few others are less commonly discussed. There is the entire concept of HTTP Headers to be familiar with, one of which is the well-known Cookie. Having a decent grasp of how HTTP works makes web development much less mysterious. For more complex topics we have file uploads, multipart and streaming. Understanding of HTTP is built with time but laying a strong foundation is recommended.

I found this site's explanation somewhat approachable. MDN provides a reference that might be a bit more exhaustive.
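To take some of the mystery out of it: an HTTP/1.1 request is just text with a request line, headers, a blank line and an optional body. The Python sketch below builds and re-parses one. The function names, host and cookie value are made up, and the parser is deliberately naive (no validation, no repeated headers), so treat it as an illustration and not a usable HTTP implementation.

```python
def build_request(method, path, host, headers=None, body=""):
    """Assemble the text of an HTTP/1.1 request."""
    all_headers = {"Host": host}
    all_headers.update(headers or {})
    if body:
        # Content-Length is counted in bytes, not characters.
        all_headers["Content-Length"] = str(len(body.encode("utf-8")))
    lines = [f"{method} {path} HTTP/1.1"]
    lines += [f"{name}: {value}" for name, value in all_headers.items()]
    # A blank line separates the headers from the body.
    return "\r\n".join(lines) + "\r\n\r\n" + body


def parse_request(raw):
    """Split request text back into method, path, headers and body."""
    head, _, body = raw.partition("\r\n\r\n")
    request_line, *header_lines = head.split("\r\n")
    method, path, _version = request_line.split(" ")
    headers = dict(line.split(": ", 1) for line in header_lines)
    return method, path, headers, body


raw = build_request("GET", "/index.html", "example.com", {"Cookie": "session=abc123"})
method, path, headers, body = parse_request(raw)

print(raw)
```

The Network tab of your browser's developer tools shows these same pieces for every request a page makes.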

WebSockets

An update on top of HTTP that allows a steady connection between client and server over which both parties can send messages. This leads to a more clearly event-driven solution. It is primarily used to allow the server to initiate activity and send data to a connected client. This makes it specifically very useful for near real-time information, such as updating graphs or in a chat application. MDN has more in-depth information.
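The connection between WebSockets and HTTP is visible in the opening handshake: the client sends a normal HTTP request with a Sec-WebSocket-Key header, and the server proves it understood by answering with a derived Sec-WebSocket-Accept value. Below is a sketch of that derivation in Python, as I understand it from RFC 6455. The function name is my own, and a real application would let a WebSocket library handle this.

```python
import base64
import hashlib

# Fixed GUID defined by RFC 6455 for the opening handshake.
WEBSOCKET_GUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11"


def accept_key(client_key):
    """Compute the Sec-WebSocket-Accept value for a given Sec-WebSocket-Key."""
    digest = hashlib.sha1((client_key + WEBSOCKET_GUID).encode("ascii")).digest()
    return base64.b64encode(digest).decode("ascii")


# Sample key taken from the example handshake in RFC 6455.
print(accept_key("dGhlIHNhbXBsZSBub25jZQ=="))
```

After this exchange the connection stops being request/response and both sides can send messages whenever they like.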

Canvas

The Canvas element and APIs allow the creation of a 2D or 3D rendering context. A 2D context is a fast way of rendering 2D graphics from code in the browser. A 3D context is useful for rendering 3D content, generally through WebGL. Both of these topics are fairly deep. WebGL is a very deep topic. But knowing that they exist and are options is probably enough that if you need them you can reach for them. MDN has you covered.

Responsive design

Responsive design is the idea that you can build a web site or system in a way where it will respond well to changes in viewport size and as such also adjust well to being loaded on differently sized screens. A big 4K desktop screen, a small laptop screen, a small tablet, a big phone, a small phone. They should all work if it makes sense for your site or application. This is often done with CSS Media queries but can be achieved in a variety of ways.

MDN has a good rundown of some of the tools available for responsive design.

Flexbox (in CSS)

A very useful layout model that was introduced into CSS that allowed some layouts that could previously only be achieved by invoking Javascript which is generally not ideal when working with layout. Flexbox provides a lot of flexibility in how to lay out columns and rows, how to make things flow on a web page and so on.

I always end up looking at this page for reference, I find the visuals very useful. MDN has a tutorial.

Grid (in CSS)

I have mostly worked with grid systems from other frameworks such as Bootstrap, but CSS has its own grid now. This looks like a decent reference and MDN has a tutorial.

CSS Frameworks (Bootstrap, Tailwind, etc)

These are collections of layout tools, components, design elements and ideas of how to lay things out that allow you to create a lot of UI without doing a lot of design from scratch. There are both advantages and disadvantages to working this way. I dare say that it is generally beneficial and generally saves time. Each framework is its own beast and I suggest reading up on what they offer and how they approach things. This one compares Tailwind CSS to Bootstrap from the perspective of Tailwind as the new challenger.

Server Fundamentals

Going towards the backend, but still very much relevant to serving your frontend code. Let's take a moment to consider the server.

Linux (operating system, commonly Ubuntu, Debian, RedHat)

A big area. It is another operating system. Distinct from, but comparable to, Windows and MacOS. It comes in many flavors, called distributions, such as Debian, Ubuntu, Red Hat and many more.

If you are working in the open source end of development tools and languages you will benefit from any and all experience with Linux and unix-style operating systems. It comes back in command line skills, it comes back in Docker, it comes back in most server administration.

One way to start is by using a "live" image which allows you to try the operating system on most computers from a USB drive. You can find a live install image for Debian here. Note: Debian is the core on which the more well-known Ubuntu was built.

Using the command line (aka. CLI)

Just like pointing and clicking on icons and buttons can be a user interface (UI), typing commands can be one too. A command line interface is generally based on entering commands on your keyboard in a text-centric presentation. Going deeper we generally get into using the common "shells" such as Bash and Zsh that provide ways to run applications and to combine applications with their input and output.

Some enthusiasts can comfortably perform all of their work from the command line, others use it mostly for programming-related tasks. Some use it sparingly or avoid it. Either way, it is incredibly useful to be comfortable with it. Building simple scripts in shell scripting (bash, zsh, et al.) is incredibly useful. Being able to interact with programs beyond what graphical interfaces allow is very powerful. Especially when working in development.

There is no definitive guide to working with a CLI as a general principle. Don't be scared of the text and terminals. They are incredibly useful.

SSH

A protocol for Secure SHell access. That is, it allows a user to connect to another computer and operate it using its command line interface. Sometimes you'll use a password to authenticate that you are who you claim. But preferably you should be using public key cryptography to safely connect.

This is the standard protocol to transfer information to and from git repositories, it is the standard for controlling servers and their software, it is a common way of accessing computers you have at home. It can also be used for a multitude of other things. Using it for SFTP or SCP replaces the need for insecure FTP. The SSH connection is encrypted and as such is much more secure than what came before it, such as telnet or FTP.

This seems like a decent chunk of information on SSH and working with it.

Bash (shell scripting)

When you open the terminal in most operating systems you are working in a shell. Windows is leaving cmd.exe for PowerShell. Linux has countless options but tends to default to bash. MacOS recently left bash for zsh. If you install Git on Windows you get Git Bash, which is a Windows version of bash along with some supporting software.

These shells are very powerful ways of combining different tools and facilities of your computer to do things that the graphical UI practically won't allow. Especially for automating and repeating tasks. You can build logic, you can build iteration. The possibilities are endless and the simple things are useful and convenient.

I don't have any definitive guides for either of these shells but this seems useful for bash.

vi/vim (text editor, known for being hard to close)

A very powerful, very unintuitive text editor. It is CLI-based, so text-based and works over SSH and such. It doesn't work like notepad. It is mode-based. I recommend getting familiar with it because it is the default on some server installs and it can be fairly annoying to override. It is also very useful. Some people use it exclusively, they are not necessarily just showing off either. I think this guide seems alright to get started.

You can get out by hitting escape a few times and then typing :q! and hitting enter.

nano (text editor, known for being useful on servers if you are not comfortable with vi/vim)

A simpler editor that is much more friendly and works more like you'd expect from common GUI editors. It is also CLI-based and works over SSH. I don't think you'll need a guide. But knowing that the command nano gives you a text editor is probably good.

SSH keys and authentication (private/public keys)

This goes back to the SSH subject. Understanding the use of private/public keys (not the math, that's not necessary) is important to efficiently and effectively use them and keep them secure. Knowing how to manage local SSH configuration in ~/.ssh/config is powerful. Knowing how to grant a certain private/public key pair access to a user account on a server is key when running servers.

This seems like a decent chunk of information on SSH and working with it (same as for SSH).

Linux package management (often apt, sometimes yum, pacman)

When working with Linux you usually don't need to navigate to a website to install applications or packages. All the open source software and free software is usually available through a package manager. A kind of predecessor to the App Store idea without any need for payments. Get familiar with the package management on your Linux. Check the documentation for your distribution.

cron (scheduling tool for *nix environments)

On top of all the power of CLI tools and shell scripting, here we have something that allows you to get stuff done when you aren't even at your computer. Or that allows your server to do scheduled work in a recurring fashion.

Cron is a way to say "run this command at this time every day" or "run this every 5 minutes" and many other variations. It is standard on most linux/unix style systems. A lot of it seemed to be covered in this piece.
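A cron schedule is written as five fields: minute, hour, day of month, month and day of week. To show how such an expression reads, here is a simplified matcher in Python. It is my own sketch and handles only "*", "*/n", plain numbers and comma lists. It does not handle ranges or names, and it requires all fields to match, whereas real cron treats a restricted day of month and day of week as either/or.

```python
from datetime import datetime


def field_matches(field, value):
    """Check one cron field against a number from the clock."""
    for part in field.split(","):
        if part == "*":
            return True
        if part.startswith("*/"):
            # Step values: "*/5" in the minute field means 0, 5, 10, ...
            if value % int(part[2:]) == 0:
                return True
        elif int(part) == value:
            return True
    return False


def cron_matches(expression, moment):
    """Return True if a five-field cron expression fires at the given minute."""
    minute, hour, day, month, weekday = expression.split()
    return (
        field_matches(minute, moment.minute)
        and field_matches(hour, moment.hour)
        and field_matches(day, moment.day)
        and field_matches(month, moment.month)
        # cron counts Sunday as 0; isoweekday() counts Sunday as 7.
        and field_matches(weekday, moment.isoweekday() % 7)
    )


# "Run this every 5 minutes"
print(cron_matches("*/5 * * * *", datetime(2020, 6, 1, 12, 10)))
# "Run this at 06:30 every day"
print(cron_matches("30 6 * * *", datetime(2020, 6, 1, 6, 30)))
```

In a crontab file each line is one of these expressions followed by the command to run.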

Web Server Technologies

Getting more specific to the Web now. Some of these are quite core others less so. Some of these may sound scary, they are just words. They are mostly quite simple once you get to know them.

HTTP (the protocol)

The web works over a fairly simple technical protocol. It is beneficial to understand it fairly well. It is core to any REST API, AJAX request or web page load. The methods GET/POST/PUT/DELETE should be familiar if you've been in web dev for a while. HEAD/OPTIONS/PATCH and a few others are less commonly discussed. There is the entire concept of HTTP Headers to be familiar with, one of which is the well-known Cookie. Having a decent grasp of how HTTP works makes web development much less mysterious. For more complex topics we have file uploads, multipart and streaming. Understanding of HTTP is built with time but laying a strong foundation is recommended.

A common web server (Apache, Nginx)

These web servers can host normal (static) files and proxy requests to "application servers", which are usually just HTTP servers (such as PHP-FPM or uwsgi). They are good for controlling headers, caching, performance, gzip, SSL certificates. A lot of important stuff.

Apache is the veteran in this fight but has generally been overthrown by the up-and-coming Nginx since a bunch of years back. It is still a fairly common choice in the LAMP setup (Linux, Apache, MySQL, PHP) and still a perfectly valid choice.

Serving static files with a web server

This is a basic thing that is incredibly useful and easy to gloss over. I think most devs benefit immensely from being able to set up these simple web servers that can perform incredibly well. Serving files rather than generating content on-demand from a dynamic application is very efficient. This used to be most of the web. I doubt it is now.

I would read up using the Apache or Nginx documentation or some tutorials.

Serving web applications by reverse proxy using a web server

A proxy server is normally a server that your traffic passes through on its way out to the things you want to visit on the internet. A reverse proxy is a web server that stands in front of the final server for something like a web application.

Many application servers that serve content over HTTP are not optimized for handling live traffic. They are intended to stand behind a reverse proxy such as Apache or Nginx that have all sorts of interesting defences and mitigations against misbehaving clients, malicious requests and generally have been tuned for good performance over many years. This is usually achieved with a reverse proxy setup.

curl (command line HTTP client)

The most popular HTTP client software in the world, I imagine. Created by a Swedish guy and used nearly everywhere either as the command curl or the software library libcurl. This is a useful tool for testing HTTP calls, downloading information online or automating HTTP workflows in scripts. It is worth getting familiar with.

You can read more on curl.haxx.se.

HTTPS

I recommend knowing what it is, what purpose it serves and how to set it up on a web server.

HTTPS is an encrypted version of the HTTP protocol using TLS (Transport Layer Security) or SSL (Secure Sockets Layer). So it has two different means of sending encrypted data. TLS is the newer one.

HTTPS prevents people from eavesdropping on data being transferred over the web, such as passwords or cat pictures. HTTPS also allows us to know with a high degree of trust that the site we attempted to connect to is the site that sent us the information we are looking at. So it protects against both spying and tampering.
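Those two guarantees map onto two checks that an HTTPS client performs. As a small sketch, this is how they show up in Python's standard ssl module. Nothing here opens a network connection; it only inspects the default client settings.

```python
import ssl

# The default client context trusts the certificate authorities of the platform.
context = ssl.create_default_context()

# Certificate verification: the server must present a certificate that chains
# to a trusted authority, which protects against impersonation.
certificate_required = context.verify_mode == ssl.CERT_REQUIRED

# Hostname checking: the certificate must be issued for the name we asked for.
hostname_checked = context.check_hostname

print(certificate_required, hostname_checked)
```

A context like this can be passed to urllib.request.urlopen through its context parameter. Turning either check off is a common shortcut when a certificate is misconfigured, and it removes exactly the protection described above.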

HTTPS is essentially mandatory for web applications that do anything with user data. The easiest way to get the necessary cryptographic certificate is using Let's Encrypt. Their guide is here. This used to be both costly and complicated. Let's Encrypt has really simplified things.

SSL/TLS

Knowing that SSL/TLS is not just for HTTPS and having a basic understanding of what the cryptography requires of you is useful. OpenSSL is probably the biggest library and suite of commands around for doing cryptographic stuff and you can use that to generate certificates, encrypt data, decrypt data and much more. This is a profession on its own so a basic idea of what it is will be plenty.

LetsEncrypt

The project that provides trusted SSL certificates to most new web projects. SSL used to be costly and complex. Get familiar with their tech and how to automate their renewal process with the certbot tool. You don't want to do this another way unless you are on a major cloud provider that does it for you. Read more on letsencrypt.org. This is not an ad, they are a public benefit organization.

Creating thumbnails on the server (Imagemagick, GD, or similar)

Sometimes you want to adjust the size of an image. There are libraries in many programming languages. There are also good command line tools that you can "shell out" to for this functionality.

This is also very useful if you need smaller copies of tons of personal photos. Scripting is great for this.

Here are a bunch of examples and techniques for imagemagick.
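As a sketch of what "shelling out" can look like, the Python below builds an ImageMagick command line and runs it as a separate process. The function names and file names are hypothetical. It assumes ImageMagick is installed: version 7 installs a magick command while older versions install convert, so the sketch looks for both.

```python
import shutil
import subprocess


def thumbnail_command(source, target, max_size=200):
    """Build an ImageMagick command that fits the image within max_size pixels."""
    # Prefer "magick" (ImageMagick 7), fall back to the older "convert".
    tool = shutil.which("magick") or shutil.which("convert") or "convert"
    # A geometry like 200x200 keeps the aspect ratio and fits inside that box.
    return [tool, source, "-thumbnail", f"{max_size}x{max_size}", target]


def make_thumbnail(source, target, max_size=200):
    """Shell out to ImageMagick; raises CalledProcessError if the tool fails."""
    subprocess.run(thumbnail_command(source, target, max_size), check=True)


# Only builds and prints the command; call make_thumbnail() to actually run it.
print(thumbnail_command("photo.jpg", "photo_thumb.jpg", 320))
```

Passing the arguments as a list rather than one string avoids shell quoting problems with uploaded file names.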

Adapting video for the web on the server (ffmpeg, usually)

Video is a lot more complex than images. But anything you want to do with video can usually be achieved by the ffmpeg tool. If you process video uploads in a web backend you need to transcode them to reliably show them to people. Browsers only support some formats. Notably H.264 (MP4) is reliable. You probably also want to process them to reduce the size. You can read more about ffmpeg here.

RSS

Slightly old-school in some ways. This is a format for subscribing to new information from a site. These days it is mostly used for podcasts. An RSS feed or Atom feed is a list of content, usually static, that can be kept up to date with content changes and allows machine systems to parse it and figure out when things are updated.
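Since a feed is just XML, reading one takes very little code. Here is a rough Python sketch using the standard xml.etree.ElementTree module on a tiny hand-written RSS 2.0 feed. The podcast, titles and URLs are made up, and a real feed reader would also need to fetch the feed and cope with malformed input.

```python
import xml.etree.ElementTree as ET

# A tiny hand-written RSS 2.0 feed; all titles and URLs are invented.
FEED = """<rss version="2.0">
  <channel>
    <title>Example Podcast</title>
    <link>https://example.com/</link>
    <description>A made-up feed</description>
    <item>
      <title>Episode 2</title>
      <link>https://example.com/episodes/2</link>
      <pubDate>Mon, 08 Jun 2020 09:00:00 GMT</pubDate>
    </item>
    <item>
      <title>Episode 1</title>
      <link>https://example.com/episodes/1</link>
      <pubDate>Mon, 01 Jun 2020 09:00:00 GMT</pubDate>
    </item>
  </channel>
</rss>"""

channel = ET.fromstring(FEED).find("channel")

# Each <item> is one piece of content; collect its title and link.
episodes = [
    (item.findtext("title"), item.findtext("link"))
    for item in channel.findall("item")
]

print(channel.findtext("title"))
for title, link in episodes:
    print(title, link)
```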

WebSockets

An update on top of HTTP that allows a steady connection between client and server over which both parties can send messages. This leads to a more clearly event-driven solution. It is primarily used to allow the server to initiate activity and send data to a connected client. This makes it specifically very useful for near real-time information, such as updating graphs or in a chat application. MDN has more in-depth information.

Database Technologies

Most developers get familiar with a single database when they start out. Learning more of them is generally fairly straight-forward. Don't expect to know all of these. The selection is a bit arbitrary but also a set of very common databases.

SQL

SQL (pronounced sequel usually) is the Structured Query Language. It is a standard for an incredibly flexible way of asking questions of and giving commands to relational databases. There are tons of tutorials. You probably want to follow one specific to your relational database as they tend to differ a bit.

This is an incredibly useful and important language to be familiar with.
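To give a feel for the language, here is a short sketch that runs a few SQL statements from Python. It uses the sqlite3 module from the standard library with an in-memory database, purely because that needs no setup. The table and its rows are made up for illustration.

```python
import sqlite3

# An in-memory database that disappears when the connection is closed.
connection = sqlite3.connect(":memory:")

# Giving commands: define a table and insert some rows.
connection.execute(
    "CREATE TABLE authors (id INTEGER PRIMARY KEY, name TEXT NOT NULL, born INTEGER)"
)
connection.executemany(
    "INSERT INTO authors (name, born) VALUES (?, ?)",
    [("Victor Hugo", 1802), ("Jules Verne", 1828), ("Selma Lagerlof", 1858)],
)

# Asking questions: filter, sort and count.
rows = connection.execute(
    "SELECT name FROM authors WHERE born < ? ORDER BY born", (1850,)
).fetchall()
count = connection.execute("SELECT COUNT(*) FROM authors").fetchone()[0]

connection.close()

print(rows)
print(count)
```

The question marks are placeholders. Passing values separately like this, rather than pasting them into the SQL string, is the standard defence against SQL injection.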

MySQL/MariaDB

Often the default DBMS (Database Management System) for PHP projects as they grew popular at the same time and are usually both available on cheap hosting providers.

There are tons of tutorials out there for this and most useful is probably working with it through your programming language to some extent. Here are the MySQL.com docs.

PostgreSQL

Known for higher reliability, correctness and a stricter approach than MySQL, Postgres was known to be slower a long time ago. I believe they've caught up on speed since. It remains an incredibly appreciated, extremely full-featured and competent database system. My general recommendation for most use-cases.

The docs are quite comprehensive.

SQLite

An embeddable database system. SQLite is a powerful SQL database that can live in a single file on your hard drive while the others need to run their own separate software. It can be used for surprising scale and performance but has distinct scalability trade-offs at certain points.

It is most commonly used in embedded devices and has shipped on Android and iOS by default for ages. These are their recommendations for using it.
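The "single file, no separate software" point can be shown in a few lines. This sketch writes a database to a temporary file, closes it, and opens it again like any other file. The file name and table are made up for the example.

```python
import os
import sqlite3
import tempfile


def single_file_demo():
    """Create a database file, close it, reopen it and read the data back."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "app.db")  # hypothetical file name

        conn = sqlite3.connect(path)  # creates the file, no server involved
        conn.execute("CREATE TABLE notes (body TEXT)")
        conn.execute("INSERT INTO notes VALUES (?)", ("hello",))
        conn.commit()
        conn.close()

        file_exists = os.path.isfile(path)

        conn = sqlite3.connect(path)  # reopen the same file later
        rows = conn.execute("SELECT body FROM notes").fetchall()
        conn.close()
        return file_exists, rows
```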

Redis

A popular datastore of the NoSQL school. Redis is often used as an in-memory only store to provide a fast cache system for frequently accessed or expensive-to-compute data. But it is very flexible and can be used for many different use-cases.

You can read more about it on redis.io.
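The caching idea itself is simple enough to sketch without a Redis server. Note that the `TinyCache` class below is a made-up in-process stand-in and not the Redis client API. It only illustrates the pattern of "look in the cache first, compute and store on a miss, let entries expire", which is how a cache like Redis is commonly used. The key format and the 60-second lifetime are arbitrary choices.

```python
import time


class TinyCache:
    """In-process stand-in for a cache server (NOT the Redis API)."""

    def __init__(self, clock=time.monotonic):
        self._data = {}
        self._clock = clock

    def set(self, key, value, ttl_seconds):
        # Store the value together with the time at which it expires.
        self._data[key] = (value, self._clock() + ttl_seconds)

    def get(self, key):
        entry = self._data.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self._clock() >= expires_at:
            del self._data[key]  # expired, treat as a miss
            return None
        return value


def get_report(cache, user_id, compute):
    """Return a cached report if there is one, otherwise compute and cache it."""
    key = f"report:{user_id}"
    cached = cache.get(key)
    if cached is not None:
        return cached
    value = compute(user_id)  # the expensive part we want to avoid repeating
    cache.set(key, value, ttl_seconds=60)
    return value
```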

Elasticsearch

A document store-style database focused on providing extremely flexible and powerful tools for search. It can be a bit heavy for some applications but is a full-featured search engine. Stemming, stop-words, faceted search, weighting and more. The feature-set is huge.

You can read more about it in the docs. The software is open source and you can run it yourself, but Elastic can be a bit heavy-handed in selling their cloud services, so it is easy to get confused.

MongoDB

A NoSQL database, document-based. It won incredible popularity a bunch of years back and then drew significant technical criticism. Most of those problems have probably been resolved since. It remains an option for some types of work. I won't say much more as I don't have heavy experience in it.

You can try it by following their guides.

Developer Tools & Tasks

Let's approach the whole programming thing slightly from the side. There are many, many things that are part of the craft of programming without actually being the writing of code. It is quite common to not be familiar with every piece of lingo and terminology whether you've used a tool or not.

Git (or other source control, but mostly Git)

Number 1 on the Joel Test.

Any professional attempt at software development should include a solid grasp on working with source control. Git is the industry standard. There are alternatives but they are minuscule in comparative popularity at this point.

Git is a tool which keeps track of changes to files. Ideally text files, such as software source code. By committing your changes at regular intervals with messages noting what you've changed this allows you and other developers to go back to previous versions, see what changed when bugs were introduced, keep developers in sync when working on code separately and much, much more. Source control is not optional and Git is the most popular one.

A strong recommendation is knowing the difference between Git and GitHub. GitHub is a service offering Git repositories along with a lot of related and unrelated tools (issue tracking, pull requests, CI). GitHub is an extremely popular way of working with Git. But it is not Git itself and it did not create Git.

Now, don't be concerned if you find the Git terminal commands confusing or the concepts foreign. It starts out like that. Just keep at it. Learning the CLI way of working with Git is probably the best way of learning Git deeply. But it can also be quite frustrating. This is a skillset that takes years to internalize fully but which you might be able to pick up to a useful level in a few days or so.

A code editor

You need a way to edit source code. It needs to be better than Notepad. Other than that the field is fairly open. Vim, Emacs, Visual Studio Code, Sublime Text and many more.

All IDEs tend to include a code editor but I didn't list them here as I think it is beneficial to know a lightweight code-editor that doesn't attempt to be a full IDE. Even if you prefer an IDE.

The basics of a code-editor are generally that it works with plain text, not something like Word or Pages. It can often highlight parts of code for better readability, so-called syntax highlighting. It can make sure to save in a consistent encoding (generally you want UTF-8). Most people use a monospace font, meaning the letters are all the same width, which makes things line up better.

A REPL

A Read-Eval-Print Loop. Usually a CLI tool that allows you to write programs line by line to see how something executes. This is what you get when you just write "python" in the terminal. This is most common and most used with fairly dynamic languages.

Some people use the REPL heavily, some use it sparingly. But being familiar with it is another tool in your belt.
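The name describes the whole mechanism, which a toy version can show. This is only a sketch to illustrate the four steps, and `tiny_repl` is a name I made up. The real Python REPL handles multi-line input, errors and much more.

```python
def tiny_repl(lines):
    """Run each input line and collect what a REPL would print for it."""
    namespace = {}  # variables survive between lines, as in a real REPL
    printed = []
    for line in lines:                      # Read
        try:
            result = eval(line, namespace)  # Eval (expressions)
        except SyntaxError:
            exec(line, namespace)           # Eval (statements such as x = 2)
            continue
        if result is not None:
            printed.append(repr(result))    # Print
    return printed                          # ...and Loop until input runs out
```

Calling `tiny_repl(["x = 2", "x * 21"])` returns `["42"]`: the assignment prints nothing and the expression prints its value, just like typing those two lines into an interactive Python session.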

An IDE

An Integrated Development Environment. This is the solve-all-your-problems approach to editing, debugging, running and building code. It will run your tests. It will download your dependencies.

Visual Studio Code straddles the line between editor and IDE as you add more packages for language support. The other Visual Studio is a full-fledged IDE by default. The IntelliJ-based editors (WebStorm, PyCharm, Android Studio, and more) are IDEs. Eclipse and NetBeans are IDEs.

The IDE approach seems more popular in .Net and Java which are fairly verbose and strict languages where this type of tooling can do more automatically and the need for supporting tools might be a bit stronger due to the verbosity and strictness.

Another tool in the belt to know about and consider.

In-editor debugging

The ability to place a breakpoint in code, run your code and have the debugger stop at that point in the code and show you what the situation is, is incredibly powerful. Add to that the ability to step forward, run code and so on. Get familiar with it, see how it works with your workflow. There are many developers that swear by debuggers and many that can't quite get into the habit. Debuggers are useful and they are easier to use with editor integration. So get familiar with debugging in your language.

Deployment tools (Ansible, Docker, etc.)

This is a whole field. I would suggest getting familiar with the dev-ops concepts of automating deploys and automated testing at an overarching level before committing to learning it all. Learn what you need to learn. If something sounds useful, investigate closer. Automation and tooling is usually a net benefit for anything that is time-consuming or error-prone. But infrastructure should not be allowed to be the death of trying a new project.

Deploying code to a production environment

If you've never put a web system online a lot of what you do is still academic. Build something simple, get a domain, put it online. Make sure you read how deployment is supposed to be done with your web framework or whatever you are working with. Sometimes there are big differences between development and production.

This is where a lot of stuff becomes relevant. HTTPS, security concerns, performance, cost, scalability, efficiency. You don't need to solve all of this for putting a first thing online. Just go get those battle scars.

Load testing tools

How do you know if your system or site will hold up under pressure? There are quite a few ways of testing this. A load testing tool is a tool for putting your system under load from something similar to real traffic to see how it reacts and functions.

Knowing that these tools exist might save you from building your own. Knowing that this can be tested can save you from some nasty surprises. Many systems have also been launched without proper load testing. This is another tool, not a required check box.

A Web Framework (such as Rails, Django, Laravel, Express)

Knowing one of the more popular and reliable web frameworks does wonders for productivity in backend web development. Don't discount them. Pick one for your ecosystem, get familiar. It often does a lot more than you think you might need. Usually for a pretty good reason.

CI/CD tools (CircleCI, TravisCI, GitHub Actions)

Automated tools to run automated scripts for automated testing, deployment and such. These are often service platforms that will charge beyond certain usage. If you want to keep a project low on errors, finding out when errors are introduced is incredibly useful. This is a central piece of working with modern reliable software development workflows.

Worth reading up on to know of them and their purpose.

Bug tracking/Issue tracking

Number 4 on the Joel Test.

Knowing what needs to be fixed, knowing if progress is being made on it and whether it should be fixed at all is part of the job a bug tracker or issue tracker can do. Jira and GitHub Issues are two very common options and there are tons more out there. They all largely do similar things in slightly different ways.

Open source projects also often have public issue trackers where you can report issues and follow up on problems. Or find things to help with!

Open source licensing (GPL, MIT, BSD, etc)

A surprising number of developers have developed a touch of lawyer-thinking. Partly because licensing is a core part of both Open Source and Free Software. Both of these phenomena are of course heavily influential on developers.

Being familiar with the larger parts of what separates the GPL from "permissive" licenses such as the MIT or BSD will benefit most software developers in the long run. Especially when you get questions about open source from your bosses.

Cloud Computing & Proprietary Server Environments

It is still perfectly possible to run a system entirely without the help of the major cloud providers or using other providers. But familiarity with them is a relevant point of inquiry. Don't sweat it though, no one knows how it works ;)

AWS

Known for having specialized services for everything along with the basic offerings of computing power, storage, etc. And for the user interface being a frustrating hellzone. So a common choice for running basically anything.

Google Cloud Platform

Probably mostly distinguished by the big data stuff where I think they are quite far along. Nicer UI than AWS, arguably, though not necessarily less confusing.

Microsoft Azure

Can't say much about them because I haven't used them. Supposedly pretty nice to work with overall. Bonus points if you need Microsoft infrastructure elements with Windows-related stuff of course. But I believe they cover and compete on any and all of the feature-set of the other two big ones.

Programming

As you might have gathered, only part of the job. But a significant part of it.

A programming language (PHP, Python, Ruby, Node.js, C#, etc)

You need to know one. It is usually good to not go wild on learning a lot of them early on. But over time it's perfectly reasonable to either specialize deeply in one or two, or to go wide and learn a lot of them. In the end, for web dev, you are bound to learn JavaScript since it is used for browser-side development.

Writing tests (Unit tests, functional tests, e2e tests, any of it)

Building the understanding of how to build reliability in systems and in development processes by creating programmatic tests. It can be quite dull and repetitive sometimes. It can also be an intricate and satisfying part of your development process. It varies. Start by getting familiar with whatever testing libraries are standard practice for your programming language and build from there.

This is key to being able to automate deployment workflows and build confidence in your systems. It is also, interestingly, something you can entirely skip. You can still build working systems. But you can't necessarily build the same confidence in them. I found the book Accelerate: The Science of Lean Software and DevOps convincing on this end.
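As a minimal sketch of what a programmatic test looks like, here is a small function with three unit tests, using the `unittest` library that ships with Python. The `slugify` function is a made-up example and not from any particular framework. Other languages have their own standard testing libraries that look broadly similar.

```python
import unittest


def slugify(title):
    """Turn a title into a lowercase, dash-separated URL slug."""
    # Replace anything that is not a letter or digit with a space, then split.
    words = "".join(c if c.isalnum() else " " for c in title.lower()).split()
    return "-".join(words)


class SlugifyTests(unittest.TestCase):
    def test_spaces_become_dashes(self):
        self.assertEqual(slugify("Hello World"), "hello-world")

    def test_punctuation_is_dropped(self):
        self.assertEqual(slugify("What's new?"), "what-s-new")

    def test_empty_title(self):
        self.assertEqual(slugify(""), "")
```

Saved to a file, this can be run with `python -m unittest` followed by the file name. Each test states one expectation, so a failure points fairly directly at what broke.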

Measuring performance/Profiling

Usually requires a combination of tools, libraries and direct programming. Sometimes you need to drill down into why something is slow or verify that something is fast enough to match requirements. The important terminology here is performance, timing and profiling. The details will vary by language.
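To put the terminology side by side, here is a sketch in Python: timing answers "how long did this take?" while profiling answers "where did the time go?". The helper names and the deliberately slow function are invented for the example. The tools used (`time.perf_counter`, `cProfile`, `pstats`) are part of Python's standard library.

```python
import cProfile
import io
import pstats
import time


def slow_sum(n):
    """A deliberately plain loop so there is something to measure."""
    total = 0
    for i in range(n):
        total += i * i
    return total


def timed(func, *args):
    """Timing: return the result and the elapsed wall-clock seconds."""
    start = time.perf_counter()
    result = func(*args)
    return result, time.perf_counter() - start


def profile_call(func, *args):
    """Profiling: return the result and a per-function report as text."""
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args)
    out = io.StringIO()
    stats = pstats.Stats(profiler, stream=out)
    stats.sort_stats("cumulative").print_stats(5)  # top 5 entries
    return result, out.getvalue()
```

The report from `profile_call` lists the functions that were called, how many times, and how much time was spent in each.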

Object-Oriented Programming

The dominant paradigm for most of the last few decades. Classes, objects, maybe even interfaces. You should know about this stuff, get a feel for what level of it you appreciate. There is quite a difference between how OOP is done in Java versus how it is done in Python. But both are object-oriented languages.

It has only in more recent years been really challenged by trending Functional Programming paradigms and some that are neither.
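A small sketch of the classes, objects and interfaces mentioned above, in Python. The notifier classes are made up for the example and only record messages instead of sending anything. The interface is expressed with an abstract base class, which is one of several ways to do it in Python.

```python
from abc import ABC, abstractmethod


class Notifier(ABC):
    """The interface: anything that can send a message to a recipient."""

    @abstractmethod
    def send(self, recipient, message):
        ...


class EmailNotifier(Notifier):
    def __init__(self):
        self.outbox = []  # each object carries its own state

    def send(self, recipient, message):
        self.outbox.append(f"email to {recipient}: {message}")


class SmsNotifier(Notifier):
    def __init__(self):
        self.outbox = []

    def send(self, recipient, message):
        self.outbox.append(f"sms to {recipient}: {message}")


def welcome(notifier, recipient):
    """Works with any Notifier without knowing which kind it was given."""
    notifier.send(recipient, "Welcome!")
```

The point of the interface is the last function: code that depends only on `Notifier` does not need to change when a new kind of notifier is added.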

Functional Programming

Some would say FP is more mathematical. And if you have the computer science degree it might be. Suffice it to say that compared to OOP it can seem simplistic or, if you stumble too deep into the terminology, extremely academic. It is a useful paradigm with a different set of trade-offs than OOP. It is growing and can be a good tool in the belt.
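For a feel of the style without the terminology, here is a small sketch in Python, which is not a functional language but supports the basics. The ideas shown are functions that only depend on their input, data that is not modified, and building behaviour by combining small functions. The order data is invented for the example.

```python
from functools import reduce

# Immutable input data: a tuple of (customer, total) pairs.
ORDERS = (("ada", 30), ("lin", 10), ("ada", 45))


def is_large(order):
    """Pure function: the answer depends only on the argument."""
    return order[1] >= 25


def total_of(order):
    return order[1]


def compose(f, g):
    """Build a new function that applies g first and then f."""
    return lambda x: f(g(x))


def sum_large_orders(orders):
    """Filter, map and reduce instead of a loop that mutates a variable."""
    return reduce(lambda acc, value: acc + value,
                  map(total_of, filter(is_large, orders)),
                  0)


double_total = compose(lambda value: value * 2, total_of)
```

Here `sum_large_orders(ORDERS)` gives 75 and `ORDERS` is left untouched.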

Compiled code vs. Interpreted code

Knowing what this is and what the trade-offs are can be perfectly sufficient.

Muddying the waters we also have Ahead-of-Time (AOT) and Just-in-Time (JIT), that are two ways to compile things with very different ways of working. That's mostly an advanced topic. If you have a decent idea of what a compiler and compiled language can be like vs. working with an interpreter, you'll have a good enough idea.

Examples: compiled languages (C, C++, Go, Rust), compiled languages that compile to an intermediate format (C#, Java, Erlang, Elixir), interpreted languages (Python, Ruby, Node.js, PHP). There is some blurring of these lines with things like Python compiling files for subsequent runs and such. But unless there was a mistake, this generally holds.

IO-bound vs. CPU-bound

This is mostly a way of reasoning about system performance. A CPU-bound task means that the end-user will be waiting for your code to finish a CPU task. An IO-bound task means that the source of any waiting is IO (input/output), such as another server on the network, a storage drive or a remote API of some sort. A lot of web applications tend to be IO-bound most of the time, waiting for database query results, and that usually means that optimizing for "fast code" might not give the results you'd hope for compared to optimizing database queries or adding a caching layer.
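A rough way to see the difference is to simulate it. In the sketch below `time.sleep` stands in for waiting on a database or remote API, which is an assumption made to keep the example self-contained. Because the program is only waiting, doing the waits at the same time with threads finishes much sooner, even though no code got "faster".

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fake_query(delay):
    """Stand-in for an IO-bound call: all the time is spent waiting."""
    time.sleep(delay)
    return delay


def run_sequential(delays):
    return [fake_query(d) for d in delays]


def run_concurrent(delays):
    # One thread per pending "query"; they all wait at the same time.
    with ThreadPoolExecutor(max_workers=len(delays)) as pool:
        return list(pool.map(fake_query, delays))


def timed(func, *args):
    start = time.perf_counter()
    result = func(*args)
    return result, time.perf_counter() - start
```

With four waits of 0.05 seconds the sequential version takes about 0.2 seconds and the concurrent one roughly 0.05. A CPU-bound task would not see the same benefit from this trick, since the time there is spent computing rather than waiting.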

Architecture

This is not generally considered entry-level. But depending on one's reading interests all sorts of things can be familiar.

Microservices

The art of dividing a system into multiple mostly independent pieces of software that can be worked on separately. This is often done to allow a few different types of scaling up a software system. Either scaling specific services on their own or scaling a team to a grander size without them stepping on each other by dividing ownership and responsibilities.

There are numerous articles, blog posts, etc. about the trade-offs inherent in working with a microservice architecture. If you think it might be something for you, educate yourself.

Monoliths

The monolith is the counterpoint to the microservice architecture. It doesn't necessarily mean that you avoid separating concerns but it usually means that you do not separate software concerns with network boundaries. There is less overhead than a microservice architecture but it has other trade-offs. A commonly cited disadvantage is that it makes it easier to be sloppy in thinking about your data. There is a lot of potential for making a more tightly integrated system, in both good and bad ways.

This is covered by mostly the same articles and blog posts that cover the trade-offs of microservices. Most popular web frameworks are designed to build fairly monolithic applications by default, though it isn't really mandatory.

Event-driven architecture

This can mean a number of things. It can mean the very specialized architecture known as Event Sourcing, or just an architecture that builds heavily on events in some fashion. These are fairly in-depth and diverging topics. It is enough to know that there are more architectures than you can imagine. Some of them are very interesting, many focus on events.
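To give a flavour of the Event Sourcing end of that spectrum, here is a toy sketch. Instead of storing the current state, you store the list of things that happened and compute the state by replaying them. The account example and event names are made up, and real systems add a lot on top (persistence, snapshots, versioning).

```python
def apply_event(balance, event):
    """Return the new balance after one event. Events are (kind, amount)."""
    kind, amount = event
    if kind == "deposited":
        return balance + amount
    if kind == "withdrawn":
        return balance - amount
    raise ValueError(f"unknown event kind: {kind}")


def replay(events):
    """Rebuild the current state from the full history of events."""
    balance = 0
    for event in events:
        balance = apply_event(balance, event)
    return balance


# The event log is the source of truth; it is only ever appended to.
EVENT_LOG = [("deposited", 100), ("withdrawn", 30), ("deposited", 5)]
```

`replay(EVENT_LOG)` gives 75, and replaying only the first two events shows what the balance was at that earlier point, which is one of the things that makes the approach interesting.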

Security & Safety

This is a career in itself, but we all need to be familiar to some extent. A lot of these are more prescriptive, you must, you have to, mandatory, required, must not. This is to emphasize the importance. If you disagree, knock yourself out.

Password hashing

Do not use MD5. SHA-1 is also wrong. They were not intended for passwords and should not be used for that.

Investigate current recommendations for password hashing. The idea is that your system must not know the current password. If you can provide the password in plain text, something is wrong. Hashing is a one-way process. Cryptographically safe hashing should be quite infeasible for most potential threats to break.

So how do you log a user in and check the password they type? You perform the same hashing on the password they provide and compare the hashes. If you got the same hash as the one stored they gave the same password. You never decrypted the password. You shouldn't be able to.
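The flow described above can be sketched with Python's standard library. PBKDF2 is used here only because it ships with Python. It is one of several algorithms meant for passwords, and both the algorithm and the iteration count below are assumptions for illustration, so check current recommendations (or use a well-maintained library) for real systems. The sketch also adds a random salt per password, a common practice that makes identical passwords produce different hashes.

```python
import hashlib
import hmac
import os

# Assumption for this example; pick the value from current guidance.
ITERATIONS = 600_000


def hash_password(password, salt=None):
    """Return (salt, digest). Both are stored; the password itself is not."""
    if salt is None:
        salt = os.urandom(16)  # new random salt for every new password
    digest = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, ITERATIONS
    )
    return salt, digest


def verify_password(password, salt, stored_digest):
    """Hash the attempt the same way and compare the hashes."""
    _, candidate = hash_password(password, salt)
    # Constant-time comparison, so timing does not leak how close a guess was.
    return hmac.compare_digest(candidate, stored_digest)
```

At signup you store the salt and digest from `hash_password`. At login you call `verify_password` with what the user typed. Nothing is ever decrypted.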

Knowing that things work this way and should work this way is very important. Another option with growing popularity is to avoid storing any password information and relying on other authentication systems such as an e-mail link, banking login or 3rd party authentication.

SQL Injection Vulnerabilities

If you aren't thinking in security-terms and writing some PHP you might write this:

mysql_query("SELECT * FROM users WHERE email='$EMAIL' AND password_hash='$HASHED_PASSWORD';")

Something similar is possible in most languages. You put the variables you want to query about inside the query in the simplest manner possible. Now consider what happens if someone enters a random password and the email ' OR 1 = 1 OR ''='. You'd have this query:

SELECT * FROM users WHERE email='' OR 1 = 1 OR ''='' AND password_hash='asdf';

Which would match every user, and a typical login check would then just pick the first one. This is why you should pass variables carefully in accordance with whatever your database library demands so it can escape your input properly. This does not only apply to login queries. Miss a single one of these and a visitor can remove most or all of your data with some experimental DROP TABLE or DROP DATABASE queries.
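The same problem and its fix can be reproduced safely in Python with an in-memory SQLite database. The table contents are made up. The unsafe function builds the query by pasting strings together, like the PHP example above, and the safe one passes the values separately as parameters so the database library handles them. Placeholder syntax varies between libraries (`?` here), so check the one you use.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (email TEXT, password_hash TEXT)")
conn.executemany(
    "INSERT INTO users VALUES (?, ?)",
    [("admin@example.com", "hash-1"), ("user@example.com", "hash-2")],
)


def find_user_unsafe(email, password_hash):
    """DO NOT do this: user input becomes part of the SQL text."""
    query = (
        f"SELECT email FROM users "
        f"WHERE email='{email}' AND password_hash='{password_hash}'"
    )
    return conn.execute(query).fetchall()


def find_user_safe(email, password_hash):
    """Values are passed as parameters and never parsed as SQL."""
    return conn.execute(
        "SELECT email FROM users WHERE email = ? AND password_hash = ?",
        (email, password_hash),
    ).fetchall()


ATTACK_EMAIL = "' OR 1 = 1 OR ''='"
```

With `ATTACK_EMAIL` and any password, the unsafe version returns every user in the table while the safe version returns nothing, because no user has that literal string as their email.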

Security mailing lists

If you are maintaining a system, it builds on a number of underlying systems, such as a Linux distribution, web frameworks, media transcoding software and dependencies from the language ecosystem. Any of these can introduce security vulnerabilities. They are a fact of life in software. If you can track down security news sources for all the software you directly depend on, subscribe to those newsletters, RSS feeds, mailing lists, whatever it might be. And skim them efficiently. They generally do not post unnecessary things and should be low volume.

Then you know what you must act on.

Backing up data

Absolutely critical for any production system. You need to store copies of your data and transfer them away from your system. Decide how far back you want to go and what the tolerable interval is (the worst loss you can get is if everything burns down just before your backup process runs). There are ways of building redundancy as well that can help here where the data lives in multiple places. But backups are necessary either way. There are many tools for this. Often you need to back up your database, and the documentation for your DBMS should have more information on that. Any files you need to back up can be backed up with any of numerous backup tools.

Testing backup restoration

A backup is basically pointless if you do not confirm that it can be successfully restored. This should be done automatically if at all feasible. It must be done. An untested backup solution is only a maybe. It is entirely too often taken on faith.

There is also the concept of running a fire drill, which is usually a time when you as a software team decide to verify your restore and recovery procedures for either setting things back up from scratch or restoring a backup. This can identify important practical challenges in executing your recovery plan. This is not as common as one would want it to be.
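As a small-scale sketch of "back up, then prove the backup restores", the example below uses SQLite through Python's standard library, simply because it needs no server. The table, file names and row count are invented, and other databases have their own dump and restore tools. The important part is the second function: open the backup as if it were the real thing and check that the data is actually there.

```python
import os
import sqlite3
import tempfile


def backup_database(source_path, backup_path):
    """Copy a live SQLite database to a backup file."""
    source = sqlite3.connect(source_path)
    target = sqlite3.connect(backup_path)
    with target:
        source.backup(target)  # SQLite's online backup, available in Python 3.7+
    source.close()
    target.close()


def verify_restore(backup_path, expected_rows):
    """Open the backup and check that it is intact and contains the data."""
    restored = sqlite3.connect(backup_path)
    try:
        intact = restored.execute("PRAGMA integrity_check").fetchone()[0] == "ok"
        count = restored.execute("SELECT COUNT(*) FROM orders").fetchone()[0]
    finally:
        restored.close()
    return intact and count == expected_rows


def fire_drill():
    """Create a database, back it up and verify the backup end to end."""
    with tempfile.TemporaryDirectory() as tmp:
        live = os.path.join(tmp, "live.db")
        backup = os.path.join(tmp, "backup.db")

        conn = sqlite3.connect(live)
        conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, item TEXT)")
        conn.executemany("INSERT INTO orders (item) VALUES (?)",
                         [("book",), ("pen",), ("lamp",)])
        conn.commit()
        conn.close()

        backup_database(live, backup)
        return verify_restore(backup, expected_rows=3)
```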

Application monitoring & metrics

How is your application performing? Are things up and running but super slow? Is something down, with parts of the system throwing errors? There are systems for this, both premium services for pay and free things you can host yourself. This sort of tech often generates quite a bit of data and can be a bit challenging to do right. It is incredibly useful for identifying issues and if it provides enough information it allows digging down deeper into the details.

Logging

Outputting information about what the system is doing, often with a focus on the most important operations, such as requests for an HTTP server. Or exceptions, errors and other types of problems that need to be identified and investigated.

Most systems benefit from logging errors, warnings and things that are somehow not quite working right. This is often also used a lot during development.

Sometimes a system will log to local files, sometimes to an external system, sometimes to both.
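Most languages have a standard way of doing this. As a sketch, here is what it can look like with Python's built-in `logging` module. The logger name, message format and the `charge` function are made up for the example. The logger writes to whatever stream it is given, which could be a file, the terminal or something that forwards to an external system.

```python
import io
import logging


def build_logger(stream):
    """Create a logger that writes INFO and above to the given stream."""
    logger = logging.getLogger("shop.checkout")  # hypothetical name
    logger.setLevel(logging.INFO)
    logger.handlers.clear()   # avoid duplicate output if called twice
    logger.propagate = False  # keep the example's output in one place
    handler = logging.StreamHandler(stream)
    handler.setFormatter(
        logging.Formatter("%(asctime)s %(levelname)s %(name)s %(message)s")
    )
    logger.addHandler(handler)
    return logger


def charge(logger, order_id, amount):
    """Log the important operation, and log the full error if it fails."""
    logger.debug("raw amount: %r", amount)  # below INFO, so not written
    logger.info("charging order %s", order_id)
    try:
        if amount <= 0:
            raise ValueError("amount must be positive")
    except ValueError:
        # logger.exception records the message plus the traceback.
        logger.exception("charge failed for order %s", order_id)
        return False
    return True
```

Log levels are what make this manageable: during development you might lower the level to see the debug lines, while production keeps only the more important ones.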
