
Goals for Modern Online File Explorers

File explorers are great tools for programmers when they help code be understood, but what does it take to ship a file explorer, and what does it mean to help programmers by providing one?

Bradley Meck Farias

January 23, 2023


npm shipped a file explorer a little while ago and it is rad to see! It is a wonderful feature, but people may not realize how much infrastructure something seemingly so simple requires. Engineers have to do a lot of backend work and design to handle something like this.

Like many things that appear simple, a lot of work and effort goes into achieving the goals of even something as simple as a file explorer. Modern expectations can also make what seems simple quite complex, or even reflect a shift in the needs and desires of the average programmer. So, for a given tool, we have to start asking questions and lamenting all the complexity, like:

  • Does the tooling happen on the server or client?

    Doing things on the client allows maximal flexibility but also means that page loads could potentially slow to unusable speeds. At Socket Security we mix these behaviors, as do most other modern sites. Some things can be cached on the server or need information that just isn't local to a file, but other times you can cache information better on the client! A good example of this tension is code highlighting. If we highlight code on the server, updates to the highlighter will invalidate the cache, and the highlighted code sent to the browser may need a separate cache of its own so that the page containing it can be updated without evicting the cached files. If the highlighting is done eagerly, pages load faster once it has been computed, but the file may never even be viewed before a new version is published, and there now need to be 2 copies of the file so that a person can still download the original.
  • Is there a maximum size of file to display?

    If a file is too large to be easily understood with the tools available on the web, or if the tools break down due to its size, what should be done? Tooling that does complex analysis of a file may simply be too slow to run eagerly, so servers will run it on demand. Even then, some complex tasks may take too long, and you don't want to hold up showing useful information while you wait for just a little bit more to finish being calculated. This is especially important for viewing diffs.
  • How are files stored?

    Content-addressable storage is pretty normal these days and helps you avoid exposing terabytes of storage directly to the internet. This is particularly important for things like Socket or npm, where files are often duplicated across package versions.
  • Is the file explorer used for scraping?

    Bots, search engines, and other automation crawl file explorers quite often. This kind of scraping can be a supported workflow, but often it isn't the intended use. If you place a file explorer in a public location, there is a tension between rate limiting automation and allowing the generally intended usage.
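The storage and highlighting questions in the list above can be made concrete with a small sketch. It is not how npm or Socket store files; it is a toy in-memory content-addressable store, assuming SHA-256 as the content hash, with a hypothetical `highlight_cache_key` helper that shows why derived output such as highlighted HTML wants a key that includes the highlighter version. The package name and file contents are made up for illustration.

```python
import hashlib


class ContentStore:
    """Toy in-memory content-addressable store (illustrative only)."""

    def __init__(self) -> None:
        # content hash -> raw bytes, stored once no matter how many versions share it
        self.blobs: dict[str, bytes] = {}
        # (package, version, path) -> content hash
        self.index: dict[tuple[str, str, str], str] = {}

    def put(self, package: str, version: str, path: str, data: bytes) -> str:
        digest = hashlib.sha256(data).hexdigest()
        # Identical bytes from different versions map to the same key,
        # so the blob is only stored the first time it is seen.
        self.blobs.setdefault(digest, data)
        self.index[(package, version, path)] = digest
        return digest

    def get(self, package: str, version: str, path: str) -> bytes:
        return self.blobs[self.index[(package, version, path)]]


def highlight_cache_key(content_hash: str, highlighter_version: str) -> str:
    """Hypothetical key for highlighted output.

    It changes when either the file or the highlighter changes, while the
    original bytes stay cached under the content hash alone.
    """
    return f"highlight:{highlighter_version}:{content_hash}"


store = ContentStore()
h1 = store.put("example-pkg", "1.0.0", "index.js", b"module.exports = 1;\n")
h2 = store.put("example-pkg", "1.0.1", "index.js", b"module.exports = 1;\n")
h3 = store.put("example-pkg", "1.0.1", "README.md", b"# example\n")
```

Here the two `index.js` entries share one blob, and bumping the highlighter version only changes the derived key, so the original file remains available for download without being stored twice.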

All of these concerns and questions are just the very smallest bits of shipping a full file explorer at the scale of the npm registry, which is quite impressive to see! It is easy to forget just how large the npm ecosystem is given how simple and seamless the file explorer looks. Most systems couldn't handle the bandwidth of new packages coming in, let alone set up the infrastructure to maintain tooling around all versions of all packages.

All of this, though, is a bunch of questions about what your tool can handle, not about what a user wants or the goals of using the tool itself!

At Socket, the goal of a file explorer is not just to see individual files but to peruse an interconnected set of dependencies and how important information cascades through them, from issues in dependencies to tracking where a variable came from. While our goals center on the product we serve, other companies like GitHub are providing features such as enhanced search for symbols in your code. These features matter to people visiting your file explorer because they alleviate the need to clone a git repository or install an npm package to understand it.

In order to understand code, programmers have come to expect a certain level of features from their programming environments, and those should be available on the web too. In recent months this has become an even more complex topic as AI begins to be able to generate code or even explain it. Here at Socket Security we are trying to reach a level of comfort and familiarity that lets you understand at a glance while you use our product, rather than needing to study carefully. This lets you go from exploring to using (or not using!) packages more quickly, so you can focus on your own goals rather than the specifics of how code was written.

These goals come down to a few basic things to keep in mind:

  • Programmers expect visualizations to help them.

    They expect highlighting in their code. This helps them quickly understand the structure of how code will be interpreted rather than needing to maintain close context on where and when a specific word was written. This goes doubly for nice features like being able to see whether parentheses are matched. Seeing errors as a red squiggle or some other visualization helps to quickly determine problems or quality concerns.

    Some specialized tools for things like disk usage actually use background colors as meters to show data and help exploration within file lists themselves as well. Use up the screen space, but don't get too noisy!
  • Programmers expect to be able to find what some code is referring to.

    Modern tools provide context for variables and imports. Navigating a code base is greatly sped up by links, rather than manually finding where something was defined or what files it pulled in. Luckily, for most modern toolchains you can use a Language Server and even run it on the server ahead of time if you need to.
  • Tools are starting to appear that programmers can use to help themselves without direct assistance.

    People are playing with AI to get high-level help, instead of relying only on pair programming to learn code bases. Things like ChatGPT can be asked questions and give some quick insights about code. We can probably expect this to start being used to help explain code that a programmer is unfamiliar with, and to be integrated into web file explorers, potentially through things like browser extensions.
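The matched-parentheses feedback mentioned in the first item above is one of the simpler visual aids to compute. The following is a minimal sketch, not how any particular editor does it: a hypothetical `first_bracket_problem` function scans the text with a stack and reports the line and column a UI could mark with a squiggle. It deliberately ignores strings and comments, which a real tool would have to handle through the language's tokenizer.

```python
# Closing bracket -> the opening bracket it must match.
PAIRS = {")": "(", "]": "[", "}": "{"}
OPENERS = set(PAIRS.values())


def first_bracket_problem(source: str):
    """Return (line, column, message) for the first bracket problem, or None.

    Lines and columns are 1-based. Brackets inside strings and comments are
    not skipped, which is a simplification for this sketch.
    """
    stack = []  # entries are (opening bracket, line, column)
    for line_no, line in enumerate(source.splitlines(), start=1):
        for col_no, ch in enumerate(line, start=1):
            if ch in OPENERS:
                stack.append((ch, line_no, col_no))
            elif ch in PAIRS:
                if not stack or stack[-1][0] != PAIRS[ch]:
                    return (line_no, col_no, f"unexpected '{ch}'")
                stack.pop()
    if stack:
        # Report the innermost bracket that was never closed.
        ch, line_no, col_no = stack[-1]
        return (line_no, col_no, f"unclosed '{ch}'")
    return None
```

Returning a line and column, rather than just a yes or no, is what lets the explorer point at the exact spot instead of only saying the file has a problem.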

Programmers expect to be able to share code. In general, when you are perusing an unfamiliar code base you may want to copy a link to a specific line of code and be able to come back to it. This could be to ask another person a question by sending them a link, to keep a bookmark for future reference, or even just because you like collecting fun snippets of code for memes. This is a bit complex, though: if you store the data in content-addressable storage, you would most likely lose information by linking to the stored content directly. Instead, the context of the page needs to provide that information.

  • The hash for the storage of a file might be shared across multiple versions of a package, so a direct link to it loses information: it doesn't know the context from which the file contents were accessed. If code in ./foo.js has a dependency on ./bar.js, any permalink you provide to ./foo.js needs to know which version of ./bar.js to link to! So each permalink needs to hold enough information to recreate that context. Luckily, this can be done using npm package versions for Socket's web interface; you can see 1.2.2 in https://socket.dev/npm/package/minimist/files/1.2.2/package.json, so jumping to the package entry point from that file will preserve the right version. Other sites like GitHub pin to a git commit, as in https://github.com/SocketDev/socket-sdk-js/blob/107ff4c81316d3c9dba960c1cb18d2aab7d9c4aa/package.json (psssst, if you want to get this permalink on GitHub, there is a shortcut: press y to make the UI add it).
  • Information needs to be locally relevant. While showing the line / column of an error is great, being able to take the programmer directly to that line / column and preserve that information is even better. This also needs to be shareable and kept in the URL. IDEs will often expose this with a "problems" panel somewhere for easy reference across all the files while you peruse; clicking on a specific issue takes you to the file location. This is why we add tooling to quickly jump to issues, like the issue list on the minimist Issues page on Socket, which links directly into the file explorer.
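The two points above can be sketched as a permalink that carries its own context. The path layout below is copied from the minimist URL quoted above; everything else is an assumption for illustration, including the `FileLocation` type, the helper names, and in particular the `#L<line>C<column>` fragment, which is a hypothetical format and not a documented Socket one.

```python
import posixpath
import re
from dataclasses import dataclass, replace
from typing import Optional

BASE = "https://socket.dev/npm/package"  # prefix taken from the example URL above
_FRAGMENT = re.compile(r"^L(\d+)(?:C(\d+))?$")  # hypothetical "#L<line>C<column>"


@dataclass(frozen=True)
class FileLocation:
    package: str
    version: str
    path: str
    line: Optional[int] = None
    column: Optional[int] = None


def build_permalink(loc: FileLocation) -> str:
    """Encode the package, version, path, and optional position into one URL."""
    url = f"{BASE}/{loc.package}/files/{loc.version}/{loc.path}"
    if loc.line is not None:
        url += f"#L{loc.line}"
        if loc.column is not None:
            url += f"C{loc.column}"
    return url


def parse_permalink(url: str) -> FileLocation:
    """Recover the full context from a URL made by build_permalink."""
    body, _, fragment = url.partition("#")
    if not body.startswith(BASE + "/"):
        raise ValueError("not a file explorer permalink")
    package, sep, rest = body[len(BASE) + 1:].partition("/files/")
    version, slash, path = rest.partition("/")
    if not (package and sep and version and slash and path):
        raise ValueError("permalink is missing package, version, or path")
    line = column = None
    match = _FRAGMENT.match(fragment)
    if match:
        line = int(match.group(1))
        column = int(match.group(2)) if match.group(2) else None
    return FileLocation(package, version, path, line, column)


def resolve_relative(loc: FileLocation, specifier: str) -> FileLocation:
    """Follow a relative import while staying in the same package version."""
    target = posixpath.normpath(
        posixpath.join(posixpath.dirname(loc.path), specifier)
    )
    return replace(loc, path=target, line=None, column=None)
```

Because the version travels in the URL, following `./bar.js` from `foo.js` lands on the `bar.js` from that same version, which a bare content hash could not tell you.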

Socket Security is very excited to see npm ship their explorer, and we are excited about the breakthroughs we expect to see over the next few years in the semantic understanding that file explorers provide! It goes both ways as well: lots of programmers who have already installed things don't want to have to leave their tooling to get help, and we can expect their programming environments to improve directly by making use of file explorers whenever the two can sync up.

