Thursday, May 25, 2023

Viewing Large Text Files

I recently needed to view and search a 20 GB log file and realized that I don’t know of any Mac text editors that are disk-based, i.e. that don’t load the entire file into RAM. Wikipedia has a list of editors with Large File Support, but it seems to be more about not imposing artificial size limits than about supporting files that don’t fit in memory.
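To make “disk-based” concrete: searching doesn’t require the whole file in memory, only a window over it. Here’s a minimal grep-style sketch in Python that streams the file and never holds more than one line at a time (the file name and pattern are hypothetical):

```python
import re

def stream_search(path: str, pattern: str) -> None:
    # Stream the file line by line; peak memory is roughly one line,
    # no matter how large the file is.
    regex = re.compile(pattern)
    with open(path, "r", encoding="utf-8", errors="replace") as f:
        for number, line in enumerate(f, start=1):
            if regex.search(line):
                print(f"{number}: {line.rstrip()}")

stream_search("huge.log", r"ERROR")  # hypothetical file and pattern
```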

What I ended up doing in the moment, because I knew it would work, was to use split to break the file into smaller chunks. I then did a multi-file search with BBEdit.
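Something like `split -b 1000m huge.log chunk.` does the byte-level version of this (file names hypothetical). If you want the chunks aligned to line boundaries, so that a match never straddles two files, here’s a minimal Python sketch under the same assumptions:

```python
# Minimal sketch: split a huge log into ~1 GB pieces without loading
# it all into RAM, keeping each piece aligned to a line boundary.
# "huge.log" and the "chunk" prefix are hypothetical names.
CHUNK_BYTES = 1 << 30  # ~1 GB per piece

def split_log(path: str, prefix: str = "chunk") -> None:
    index = written = 0
    dst = open(f"{prefix}.{index:03d}", "wb")
    with open(path, "rb") as src:
        for line in src:  # streams the file; memory stays at one line
            if written >= CHUNK_BYTES:
                dst.close()
                index += 1
                written = 0
                dst = open(f"{prefix}.{index:03d}", "wb")
            dst.write(line)
            written += len(line)
    dst.close()

split_log("huge.log")
```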

I considered Hex Fiend, which is disk-based. It’s fast and has a find feature, but it’s less than ideal for this use case because it doesn’t support UTF-8 or show line breaks.

For short tasks in the future, I will probably use less, because it can directly open and search large UTF-8 files. But for longer tasks I don’t want to be working in Terminal.


Matthew Vinton

When it comes to viewing giant multi-gigabyte log files, I haven’t found anything that works better than glogg (https://glogg.bonnefon.org). It isn’t Mac-like, but it is open source.

Hal Mueller

I wonder how Emacs would do with that. I have certainly used Emacs to abuse many computers with very big file edits.

@Matthew Thanks! glogg was able to open my file using only 700 MB of RAM.

@Hal Emacs loads the whole file into RAM.

Pierre Lebeaupin

I would go further and remind Hal that emacs is the poster child for loading the whole file into RAM. Before that, Unix tools would, if they didn’t work in a streaming fashion in the first place, at least load the file in parts (this did not include editors as we know them, because they did not really exist: people worked with line editors back then). It’s really the GNU project and its lead, with emacs as the flagship, that led to the everything-in-RAM Unix world we find ourselves in.

The same happened on the Mac: BBEdit made the same forward-looking decision to load it all in RAM right away back in 1992 when that was still contentious.

Avoiding loading the whole file into RAM seems to have developed in more specialized tools that could assume more structure (e.g. editors dedicated to a programming language) or, on the contrary, in tools for file types without structure, as is the case for hex editors, but it never grew beyond a niche for text editing in general.

@Hal has a point: you can (ab)use Emacs to load files in chunks, keeping only the buffer content (plus some leeway around it) in RAM. VLF (View Large Files) is the key; code on GitHub:
https://github.com/m00natic/vlfi

If you are willing to use the terminal, less is not memory-limited. It can browse huge files, and it has a bunch of nice features, such as:

* Wrap or unwrap long lines (toggle with -S)
* Search forward and backward (/ and ?)
* Mark positions and jump back to them (m and ')
* Pre-processing via LESSOPEN, so you can open .gz log files directly (see the sketch below)
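The .gz trick doesn’t need less, either: Python’s standard gzip module can stream a compressed log without writing a decompressed copy to disk. A minimal sketch, with a hypothetical file name and search string:

```python
import gzip

def search_gz(path: str, needle: str) -> None:
    # Decompress on the fly; only a small buffer is in memory at once.
    with gzip.open(path, "rt", encoding="utf-8", errors="replace") as f:
        for line in f:
            if needle in line:
                print(line, end="")

search_gz("system.log.3.gz", "error")  # hypothetical names
```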

You didn't mention whether this is a system log or some other sort. If the former, Howard Oakley's apps are indispensable:

- [Mints](https://eclecticlight.co/mints-a-multifunction-utility/)
- [Ulbow](https://eclecticlight.co/consolation-t2m2-and-log-utilities/)
