... filetags example demonstrating: controlled vocabulary file
=~/.filetags=, tagging multiple files at once, removing tags by
prepending a minus character, tagging using the proposed number
shortcuts, tab completion of tags via =Tab=, and mutually exclusive
tags (switching from =draft= to =final= without manually removing =draft=).
This Python script adds or removes tags to file names in the following
form:
"file without time stamp in name -- tag2.txt"
"file name with several tags -- tag1 tag2.jpeg"
"another example file name with multiple example tags -- funvideoskids.mpeg"
"2013-05-09 a file name with ISO date stamp in name -- tag1.jpg"
"2013-05-09T16.17 file name with time stamp -- tag3.csv"
The script accepts an arbitrary number of files (see your shell for
possible length limitations).
Target group: users who are able to use command line tools and who
are using tags in file names.
Besides the fact that I am using [[https://en.wikipedia.org/wiki/Iso_date][ISO dates and times]] in file names
(as shown in examples above), I am using tags with file names. To
separate tags from the file name, I am using the separator
"space dash dash space".
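The naming convention can be illustrated with a few lines of Python. This is only a minimal sketch and not the code filetags itself uses: the function name =split_tagged_name= is made up for this example, and the extension handling simply relies on =os.path.splitext=.

```python
import os

TAG_SEPARATOR = " -- "  # "space dash dash space", as described above


def split_tagged_name(filename):
    """Split a file name into (basename, list of tags, extension).

    Hypothetical helper for illustration; it assumes the last " -- "
    before the extension introduces the tags.
    """
    stem, extension = os.path.splitext(filename)
    if TAG_SEPARATOR in stem:
        basename, tag_string = stem.rsplit(TAG_SEPARATOR, 1)
        tags = tag_string.split()
    else:
        basename, tags = stem, []
    return basename, tags, extension


print(split_tagged_name("file name with several tags -- tag1 tag2.jpeg"))
```

Note that =os.path.splitext= splits at the last dot, so a file name without extension but with a dot in its time stamp would need extra care.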
Tagging files this way requires a file renaming process. Adding (or
removing) tag(s) to a set of files results in multiple renaming
processes. Despite advanced renaming tools like vidir (from
[[http://joeyh.name/code/moreutils/][moreutils]]) it's handy to have a tool that makes adding and removing
tags as simple as possible.
You may like to add this tool to your image or file manager of
choice. I added mine to [[http://geeqie.sourceforge.net/][geeqie]] which is my favorite image viewer on
GNU/Linux.
[[https://media.ccc.de/v/GLT18_-321-en-g_ap147_004-201804281550-the_advantages_of_file_name_conventions_and_tagging-_karl_voit/][bin/2018-05-06 filetags demo slide for video preview with video button -- screenshots.png]]
If you have installed Python 2 and Python 3 in parallel, make sure to
use the correct pip version. You might need to use =pip3= instead of
=pip=. If you only have Python 3 installed, you don't have to care ;-)
This tool adds or removes simple tags to/from file names.
Tags within file names are placed between the actual file name and
the file extension, separated with " -- ". Multiple tags are
separated with " ":
Update for the Boss -- projectA presentation.pptx
2013-05-16T15.31.42 Error message -- screenshot projectB.png
This easy to use tag system has a drawback: for tagging a larger
set of files with the same tag, you have to rename each file
separately. With this tool, this only requires one step.
Example usages:
filetags --tags="presentation projectA" *.pptx
… adds the tags "presentation" and "projectA" to all PPTX-files
filetags --tags="presentation -projectA" *.pptx
… adds the tag "presentation" to and removes tag "projectA" from all PPTX-files
filetags -i *
… ask for tag(s) and add them to all files in current folder
filetags -r draft report
… removes the tag "draft" from all files containing the word "report"
This tools is looking for the optional first text file named ".filetags" in
current and parent directories. Each of its lines is interpreted as a tag
for tag completion. Multiple tags per line are considered mutual exclusive.
positional arguments:
FILE One or more files to tag
optional arguments:
-h, --help show this help message and exit
-t "STRING WITH TAGS", --tags "STRING WITH TAGS"
One or more tags (in quotes, separated by spaces) to
add/remove
--remove Remove tags from (instead of adding to) file name(s)
-i, --interactive Interactive mode: ask for (a)dding or (r)emoving and
name of tag(s)
-R, --recursive Recursively go through the current directory and all
of its subdirectories. Implemented for --tag-gardening
and --tagtrees
-s, --dryrun Enable dryrun mode: just simulate what would happen,
do not modify files
--hardlinks Use hard links instead of symbolic links. This is
ignored on Windows systems. Note that renaming link
originals when tagging does not work with hardlinks.
-f, --filter Ask for list of tags and generate links in
"$HOME/.filetags_tagfilter" containing links to all
files with matching tags and start the filebrowser.
Target directory can be overridden by --tagtrees-dir.
--filebrowser PATH_TO_FILEBROWSER
Use this option to override the tool to view/manage
files (for --filter; default: geeqie). Use "none" to
omit the default one.
--tagtrees This generates nested directories in
"$HOME/.filetags_tagfilter" for each combination of
tags up to a limit of 2. Target directory can be
overridden by --tagtrees-dir. Please note that this
may take long since it relates exponentially to the
number of tags involved. Can be combined with
--filter. See also http://Karl-Voit.at/tagstore/ and
http://Karl-Voit.at/tagstore/downloads/Voit2012b.pdf
--tagtrees-handle-no-tag "treeroot" | "ignore" | "FOLDERNAME"
When tagtrees are created, this parameter defines how
to handle items that got no tag at all. The value
"treeroot" is the default behavior: items without a
tag are linked to the tagtrees root. The value
"ignore" will not link any non-tagged items at all.
Any other value is interpreted as a folder name within
the tagreees which is used to link all non-tagged
items to.
--tagtrees-link-missing-mutual-tagged-items
When the controlled vocabulary holds mutual exclusive
tags (multiple tags in one line) this option generates
directories in the tagtrees root that hold links to
items that have no single tag from those mutual
exclusive sets. For example, when "draft final" is
defined in the vocabulary, all items without "draft"
and "final" are linked to the "no-draft-final"
directory.
--tagtrees-dir <existing_directory>
When tagtrees are created, this parameter overrides
the default target directory
"$HOME/.filetags_tagfilter" with a user-defined
one. It has to be an empty directory or a non-existing
directory which will be created. This also overrides
the default directory for --filter.
--tagtrees-depth TAGTREES_DEPTH
When tagtrees are created, this parameter defines the
level of depth of the tagtree hierarchy. The default
value is 2. Please note that increasing the depth
increases the number of links exponentially.
Especially when running Windows (using lnk-files
instead of symbolic links) the performance is really
slow. Choose wisely.
--ln, --list-tags-by-number
List all file-tags sorted by their number of use
--la, --list-tags-by-alphabet
List all file-tags sorted by their name
--lu, --list-tags-unknown-to-vocabulary
List all file-tags which are found in file names but
are not part of .filetags
--tag-gardening This is for getting an overview on tags that might
require to be renamed (typos, singular/plural, ...).
See also http://www.webology.org/2008/v5n3/a58.html
-v, --verbose Enable verbose mode
-q, --quiet Enable quiet mode
--version Display version and exit
: filetags --tags foo a_file_name.txt
... adds tag "foo" such that it results in a_file_name -- foo.txt
: filetags -i *.jpeg
... interactive mode: asking for list of tags (for the JPEG files) from the user
: filetags --tags "foo bar" "file name 1.jpg" "file name 2 -- foo.txt" "file name 3 -- bar.csv"
... adds tags "foo" and "bar" such that it results in ...
: "file name 1 -- foo bar.jpg"
: "file name 2 -- foo bar.txt"
: "file name 3 -- bar foo.csv"
: filetags --remove --tags foo "foo a_file_name -- foo.txt"
... removes tag "foo" such that it results in foo a_file_name.txt
: filetags --tag-gardening
... prints out a summary of tags in current and sub-folders used and
tags that are most likely typos or abandoned
For =--filter= and =--tagtrees= examples see sections below.
Independent of tags you might define on the fly, the optional file
.filetags stores a controlled vocabulary of recurrent tags; adjust
its content to your needs. In an interactive session, this set is
available for tagging any file in the folder where .filetags resides
(press the Tab key) and propagates into folders further down the
hierarchy.
Example: when filetags shows you =Top nine previously used tags in
this directory:= with =mytag(1) anothertag(2) oncemore(3)=, you
don't have to type in the tag names but use the numbers instead.
Combinations of numbers are fine as well.
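How such number shortcuts could be expanded is sketched below. This is an assumption-laden illustration, not the actual filetags code: the function name =expand_shortcuts= is hypothetical, shortcuts are assumed to be the digits 1 to 9, and a word is only treated as a shortcut combination if every digit refers to a proposed tag.

```python
def expand_shortcuts(user_input, shortcut_tags):
    """Replace shortcut numbers in the user input by the proposed tags.

    shortcut_tags is the ordered list of proposed tags; shortcut 1
    refers to its first entry (assumption for this sketch).
    """
    result = []
    for word in user_input.split():
        is_shortcut = all(
            ch in "123456789" and int(ch) <= len(shortcut_tags) for ch in word
        )
        if is_shortcut:
            candidates = [shortcut_tags[int(ch) - 1] for ch in word]
        else:
            candidates = [word]  # an ordinary tag typed by the user
        for tag in candidates:
            if tag not in result:  # avoid adding the same tag twice
                result.append(tag)
    return result


proposed = ["mytag", "anothertag", "oncemore"]
print(expand_shortcuts("13 newtag", proposed))
```

With the proposed tags from the example above, the input =13 newtag= expands to =mytag=, =oncemore= and =newtag=.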
2016-08-26: =--filter= option requires /all/ tags to be matching
2016-10-15: added tag gardening: vocabulary tags not used + tags not
in vocabulary
2016-10-16: interactively adding tags: omit already assigned tags in
shortcuts and vocabulary
2016-11-27: added existing shared tags to visual tags
2017-02-06: better help text for =--filter= option
2017-02-25: shortcut tags can be mixed with non-shortcut tags
Example: =mytag 49 anothertag= does add tags =mytag= and
=anothertag= and the shortcut tags =4= and =9=
2017-04-09:
interactively removing tags via =-tagname=:
Example: the tag input =tagname -removeme= adds the tag
=tagname= and removes the tag =removeme= from the filename(s)
try to find alternative filename if file not found
Example: if you try to tag file =My file name.pdf= which is not
found, filetags tries to look for a different (unique and
existing) filename that shares the same start of the file name
such as =My file name -- mytag.pdf=. Very handy!
This happens a lot when you are interactively adding multiple
tags one by one by simply re-executing the previous command
line: the file name changes in between because of the previous
tag(s) being added.
2017-08-27: when tagging symbolic links whose source file has a
matching file name, the source file gets the same tags as the
symbolic link of it
This is especially useful when using the =--filter= option
2021-04-03: added support for =#donotsuggest= lines within =.filetags= files to omit tags from being proposed
** Get the most out of filetags: controlled vocabulary .filetags
:PROPERTIES:
:ID: 2018-07-08-cv
:CREATED: [2015-01-02 Fri 17:12]
:END:
This awesome tool is providing support for [[https://en.wikipedia.org/wiki/Controlled_vocabulary][controlled vocabularies]].
When invoked for interactive tagging, it is looking for files named
.filetags in the current working directory and its parent
directories as well. The first file of this name found is read in.
Each line represents one tag. Those tags are used for tag
completion.
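The lookup described above (current directory first, then its parents) can be sketched as follows. The function name =find_vocabulary_file= is hypothetical and the sketch only shows the search order, not how filetags implements it.

```python
from pathlib import Path


def find_vocabulary_file(start_dir, name=".filetags"):
    """Return the first vocabulary file found in start_dir or one of
    its parent directories, or None if there is none."""
    start = Path(start_dir).resolve()
    for directory in [start, *start.parents]:
        candidate = directory / name
        if candidate.is_file():
            return candidate
    return None


print(find_vocabulary_file("."))
```

Because the search stops at the first hit, a =.filetags= file in a sub-folder takes precedence over one further up the hierarchy.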
This is purely great: with tags within .filetags you don't have to
enter the tags entirely: just type the first characters and press =TAB=
(twice to show you all possibilities). You will be amazed how
efficiently you are going to tag things! :-)
Of course, you can remove existing tags by prepending a =-= character
to the tag: =-tagname=. This also works interactively using the tab
completion feature.
You can use comments in =.filetags= files: everything after a =#=
character is considered a comment. You can even add a comment after a
tag like "=mytag # this is a test tag=".
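As a rough illustration of the comment rule, the following sketch reads the text of a =.filetags= file and returns the tags of each line. The function name =parse_vocabulary= is an assumption for this example; special comment lines such as =#donotsuggest= are simply skipped here.

```python
def parse_vocabulary(text):
    """Return one list of tags per non-empty line, ignoring everything
    after a '#' character."""
    groups = []
    for line in text.splitlines():
        content = line.split("#", 1)[0].strip()
        if content:
            groups.append(content.split())
    return groups


example = """mytag # this is a test tag
# a line that only holds a comment
draft final
"""
print(parse_vocabulary(example))
```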
If you use tags that you do not want to be proposed for tagging, you
can list them in lines like the following ones to omit them from the
proposals (case insensitive):
: #donotsuggest omit-this-tag dontshow
: #donotsuggest wontpropose
If you enter multiple tags in the same line in .filetags, they are
interpreted as mutually exclusive tags. For example, if your
.filetags contains the line winter spring summer autumn, filetags
replaces any season-tag with the new one. So if you tag the file …
: example file -- summer anothertag.txt
… with the tag winter, it gets renamed to …
: example file -- winter anothertag.txt
… without having to manually remove the tag summer.
Common mutually exclusive tags are =draft final= or =confidential
internal public=.
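The replacement of mutually exclusive tags can be sketched like this. The function name =apply_tag= is hypothetical, and putting the new tag at the position of the replaced one is an assumption chosen to reproduce the season example above; filetags itself may handle the position differently.

```python
def apply_tag(existing_tags, new_tag, exclusive_groups):
    """Add new_tag and replace any tag that shares a mutually
    exclusive group (one line in .filetags) with it."""
    rivals = set()
    for group in exclusive_groups:
        if new_tag in group:
            rivals.update(group)
    rivals.discard(new_tag)

    result = []
    for tag in existing_tags:
        replacement = new_tag if tag in rivals else tag
        if replacement not in result:
            result.append(replacement)
    if new_tag not in result:  # nothing was replaced: append as last tag
        result.append(new_tag)
    return result


seasons = [["winter", "spring", "summer", "autumn"]]
print(apply_tag(["summer", "anothertag"], "winter", seasons))
# -> ['winter', 'anothertag']
```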
Consider you have a directory that contains hundreds of files.
If you want to retrieve a file whose tags you know, you can skim
through all the files. However, filetags offers you a more elegant
possibility: you can filter the files according to one or more tags.
For example, consider the following situation:
: $HOME/my party/
: |_ 2018-06-25 Party invitation -- scan correspondence.pdf
: |_ 2018-07-31 Guest list -- correspondence.txt
: |_ 2018-08-01T11.51.44 Uncle Bob arrives.jpg
: |_ 2018-08-01T12.31.42 Sheila with her new boyfriend -- friends.jpg
: |_ 2018-08-01T14.12.23 Start of BBQ with the big steak.jpg
: |_ ...
: |_ 2018-08-01T23.53.19 Even uncle Bob desides to go home -- fun.jpg
: |_ 2018-08-05 Lessons learned for planning a party -- scan.pdf
: |_ 2018-08-06 Thank-you letter Bob -- scan.pdf
: |_ Bills/
: |_ 2018-07-30 Beverages by FreshYouUp -- scan taxes.pdf
: |_ 2018-08-03 Bill of the butcher -- scan taxes.pdf
The following command and interaction generate the following
temporary link structure:
: filetags --filter
The user is asked to enter one or more tags and she enters "scan".
filetags then creates a directory whose content consists of links to
all files matching the query. By default, the resulting directory is
=.filetags_tagfilter= in your home directory. After invoking it for our
example, the content of this retrieval directory looks like this:
: $HOME/.filetags_tagfilter/
: |_ 2018-06-25 Party invitation -- scan correspondence.pdf
: |_ 2018-08-05 Lessons learned for planning a party -- scan.pdf
: |_ 2018-08-06 Thank-you letter Bob -- scan.pdf
This way, our user is quickly able to skim through all scanned
documents to locate the one desired to retrieve.
To locate all matching files in all sub-directories as well, the user
is able to add the parameter =--recursive= ...
: filetags --filter --recursive
... and chooses to enter the tag "scan", which generates the following
temporary link structure:
: $HOME/.filetags_tagfilter/
: |_ 2018-06-25 Party invitation -- scan correspondence.pdf
: |_ 2018-08-05 Lessons learned for planning a party -- scan.pdf
: |_ 2018-08-06 Thank-you letter Bob -- scan.pdf
: |_ 2018-07-30 Beverages by FreshYouUp -- scan taxes.pdf
: |_ 2018-08-03 Bill of the butcher -- scan taxes.pdf
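The selection step behind =--filter= can be sketched in a few lines. This is not the filetags implementation: the helper names =tags_of= and =filter_by_tags= are made up, and only the matching is shown (all entered tags have to match), not the creation of the links.

```python
import os


def tags_of(filename):
    """Return the set of tags found in a file name (illustrative helper)."""
    stem = os.path.splitext(filename)[0]
    if " -- " not in stem:
        return set()
    return set(stem.rsplit(" -- ", 1)[1].split())


def filter_by_tags(filenames, wanted_tags):
    """Keep only the files that carry all of the wanted tags."""
    wanted = set(wanted_tags)
    return [name for name in filenames if wanted <= tags_of(name)]


files = [
    "2018-06-25 Party invitation -- scan correspondence.pdf",
    "2018-07-31 Guest list -- correspondence.txt",
    "2018-08-01T11.51.44 Uncle Bob arrives.jpg",
    "2018-08-05 Lessons learned for planning a party -- scan.pdf",
]
for name in filter_by_tags(files, ["scan"]):
    print(name)
```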
This function is somewhat sophisticated and not widely known. If
you're really interested in the whole
story behind the visualization/navigation of tags using TagTrees, feel
free to read [[http://Karl-Voit.at/tagstore/downloads/Voit2012b.pdf][my PhD thesis]] about it on [[http://Karl-Voit.at/tagstore/][the tagstore webpage]]. It is
surely a piece of work I am proud of and the general chapters of it
are written so that the average person is perfectly well able to
follow.
In short: this function takes the files of the current directory and
generates hierarchies up to level of =$maxdepth= (by default 2, can be
overridden via =--tagtrees-depth=) of all combinations of tags,
[[https://en.wikipedia.org/wiki/Symbolic_link][linking]] all files according to their tags.
Too complicated? Then let's explain it with some examples.
Consider having a file like:
: My new car -- car hardware expensive.jpg
When you now generate the TagTrees, you'll find [[https://en.wikipedia.org/wiki/Symbolic_link][links]] to this file within
sub-directories of =~/.filetags_tagfilter=, the default target
directory: =car/= and =hardware/= and =expensive/= and =car/hardware/=
and =car/expensive/= and =hardware/car/= and so on. You get the idea.
The default target directory can be overridden via =--tagtrees-dir=.
Therefore, within the folder =car/expensive/= you will find all files
that have at least the tags "car" and "expensive" in any order. This
is /really/ cool to have.
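The set of directories created for a single file can be sketched with =itertools.permutations=. The function name =tagtree_directories= is hypothetical and the sketch only lists relative directory names; it does not create any links.

```python
from itertools import permutations


def tagtree_directories(tags, max_depth=2):
    """List the relative TagTrees directories a file with the given
    tags would be linked into, up to max_depth levels."""
    paths = []
    for depth in range(1, max_depth + 1):
        for combination in permutations(tags, depth):
            paths.append("/".join(combination) + "/")
    return paths


for path in tagtree_directories(["car", "hardware", "expensive"]):
    print(path)
```

For three tags and the default depth of 2 this already yields 9 directories (3 on the first level plus 6 ordered pairs), which shows why increasing the depth quickly increases the number of links.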
Files of the current directory that don't have any tag at all are
linked directly to =~/.filetags_tagfilter= so that you can find and tag
them easily.
I personally use this feature within my image viewer of choice
([[http://geeqie.sourceforge.net/][geeqie]]). I mapped it to =Alt-T= because =Alt-t= is occupied by
=filetags= for tagging of course. So when I am within my image viewer
and I press =Alt-T=, TagTrees of the currently shown images are
created. Then an additional image viewer window opens up for me,
showing the resulting TagTrees. This way, I can quickly navigate
through the tag combinations to easily interactively filter according
to tags.
Please note: when you are tagging linked files within the TagTrees
with filetags, only the current link gets updated with the new name.
All other links to this modified filename within the other directories
of the TagTrees get broken. You have to re-create the TagTrees to
update all the links after tagging files.
The option =--tagtrees-handle-no-tag= controls how files with no tags
should be handled. When set to =treeroot=, untagged files are linked
in the TagTrees target directory directly. The option =ignore= does
not link them at all. The option =FOLDERNAME= links them to a
sub-directory of the TagTrees target directory that is named according
to the given value.
With the option =--tagtrees-link-missing-mutual-tagged-items= you can
control whether or not there will be an additional TagTrees folder
that contains all files which lack one of the mutually exclusive tags.
Using the example winter spring summer autumn from above, all files
that got none of those four tags get linked to a TagTrees directory
named "no-winter-spring-summer-autumn". This way, you can easily find
and tag files that don't participate in this set of mutually exclusive
tags.
Using the example files from above:
: $HOME/my party/
: |_ 2018-06-25 Party invitation -- scan correspondence.pdf
: |_ 2018-07-31 Guest list -- correspondence.txt
: |_ 2018-08-01T11.51.44 Uncle Bob arrives.jpg
: |_ 2018-08-01T12.31.42 Sheila with her new boyfriend -- friends.jpg
: |_ 2018-08-01T14.12.23 Start of BBQ with the big steak.jpg
: |_ ...
: |_ 2018-08-01T23.53.19 Even uncle Bob desides to go home -- fun.jpg
: |_ 2018-08-05 Lessons learned for planning a party -- scan.pdf
: |_ 2018-08-06 Thank-you letter Bob -- scan.pdf
: |_ Bills/
: |_ 2018-07-30 Beverages by FreshYouUp -- scan taxes.pdf
: |_ 2018-08-03 Bill of the butcher -- scan taxes.pdf
... filetags generates the temporary link structure:
: $HOME/.filetags_tagfilter/
: |_ scan/
: |_ 2018-06-25 Party invitation -- scan correspondence.pdf
: |_ 2018-08-05 Lessons learned for planning a party -- scan.pdf
: |_ 2018-08-06 Thank-you letter Bob -- scan.pdf
: |_ 2018-07-30 Beverages by FreshYouUp -- scan taxes.pdf
: |_ 2018-08-03 Bill of the butcher -- scan taxes.pdf
: |_ correspondence/
: |_ 2018-06-25 Party invitation -- scan correspondence.pdf
: |_ taxes/
: |_ 2018-07-30 Beverages by FreshYouUp -- scan taxes.pdf
: |_ 2018-08-03 Bill of the butcher -- scan taxes.pdf
: |_ correspondence/
: |_ 2018-06-25 Party invitation -- scan correspondence.pdf
: |_ 2018-07-31 Guest list -- correspondence.txt
: |_ scan/
: |_ 2018-06-25 Party invitation -- scan correspondence.pdf
: |_ friends/
: |_ 2018-08-01T12.31.42 Sheila with her new boyfriend -- friends.jpg
: |_ fun/
: |_ 2018-08-01T23.53.19 Even uncle Bob desides to go home -- fun.jpg
: |_ taxes/
: |_ 2018-07-30 Beverages by FreshYouUp -- scan taxes.pdf
: |_ 2018-08-03 Bill of the butcher -- scan taxes.pdf
: |_ scan/
: |_ 2018-07-30 Beverages by FreshYouUp -- scan taxes.pdf
: |_ 2018-08-03 Bill of the butcher -- scan taxes.pdf
: |_ has_no_tag/
: |_ 2018-08-01T11.51.44 Uncle Bob arrives.jpg
: |_ 2018-08-01T14.12.23 Start of BBQ with the big steak.jpg
: |_ ...
This looks complicated because there are many links generated the user
does not really need. The beauty of this solution is that the user is
able to navigate to a file using a wide set of different paths (the
TagTrees) and she is able to choose the one path that suits the
current cognitive model.
For example, she might want to retrieve "the one document from the
last party which she remembers of having scanned and which she used
for the invitation correspondence". With this mind-set, she most
likely retrieves the document via
=$HOME/.filetags_tagfilter/scan/correspondence/= or
=$HOME/.filetags_tagfilter/correspondence/scan/= (does not matter
which).
The large number of other TagTrees can be ignored for this retrieval
task.
Another retrieval task example would be "all photos that have no tag,
in order to continue tagging the photos". In this example, the
user visits =$HOME/.filetags_tagfilter/has_no_tag/=, fires up her image
viewer (which has filetags integrated already - see below) and
continues with the tagging activity. Since filetags synchronizes the
tags within TagTrees linked files and the original files, the original
files get renamed accordingly.
** Bonus: Using tags to specify a sub-set of photographs
:PROPERTIES:
:ID: 2018-07-08-sel-photos
:END:
You know the problem: got back from Paris and you cannot show 937
image files to your friends. It's just too much.
My solution: I tag to define selections. For example, I am using sel
("selection") for the ultimate cool photographs using filetags, of
course.
Within geeqie, which is my preferred image viewer, I redefined F to
call filetags with its =--filter= parameter. Now I get asked to enter
one or more tags to filter the current folder. For presenting only the
files that were tagged with sel, I enter sel and confirm with
Enter.
This creates a temporary folder with symbolic links to all photographs
of the current folder that contain the tag sel and it starts a new
(additional) instance of geeqie.
In short: after returning from a trip, I mark all "cool" photographs
within geeqie, choose t and tag them with sel (described in
previous section). For showing only sel images, I just press F,
enter sel and instead of 937 photographs, my friends just have to
watch the best 50 or so. :-)
Watch [[https://media.ccc.de/v/GLT18_-321-en-g_ap147_004-201804281550-the_advantages_of_file_name_conventions_and_tagging-_karl_voit][this 45 minute talk]] on how I am using this (and other) features.
* Integration Into Common Tools
If your system has Python 3 installed, you can start using filetags
right away in any command line environment.
However, users also want to integrate tools like filetags into
various GUI tools.
The [[file:Integration.org][Integration.org file]] explains the integration into some tools that
allow external commands to be added.
If you have integrated filetags in additional commonly used tools,
please send me a short how-to so that others are able to get the most
out of filetags as well.
* Related Tools and Workflows
This tool is part of a tool-set which I use to manage my digital files
such as photographs. My work-flows are described in [[http://karl-voit.at/managing-digital-photographs/][this blog posting]]
you might like to read and in the video which is linked above.
With [[https://github.com/protesilaos/denote][denote]], Protesilaos
Stavrou implemented a conceptually related approach to managing notes
within an Emacs buffer. With dired, this method can equally be
applied to files.
If you do like filetags but you prefer the syntax of [[https://www.tagspaces.org/][TagSpaces]] for
adding tags to file names, you should check out [[https://github.com/jgru/filetags][this filetags fork]].
Maintenance is limited though. Please note that my other tools
working with tags do not support the TagSpaces style either.
* Exhaustive List of All Features
:PROPERTIES:
:CREATED: [2018-07-08 Sun 13:09]
:END:
This section is an exhaustive list of features of =filetags=. As a
first-time user with simple use-cases only, you might want to skip it
in order not to get confused.
This section is particularly helpful for re-implementing =filetags=
functionality and for power users who are interested in the advanced
functions provided by this tool.
** General
| Before | When | After | Note |
|----------------------------------+--------------------+----------------------------------+--------------------------------------------|
| =Some file name.jpeg= | tagging with =foo= | =Some file name -- foo.jpeg= | Tag separator is added automatically |
| =Some file name= | tagging with =foo= | =Some file name -- foo= | There is no need for a file extension |
| =Some file name -- foo.jpeg= | tagging with =bar= | =Some file name -- foo bar.jpeg= | =bar= becomes last tag |
| =Some file name.jpeg.lnk= | tagging with =bar= | =Some file name -- bar.jpeg.lnk= | The =.lnk= extension is taken into account |
| =Some file name -- bar.jpeg= | untagging =bar= | =Some file name.jpeg= | Tag separator is removed |
| =Some file name -- foo bar.jpeg= | untagging =foo= | =Some file name -- bar.jpeg= | Tag order stays same when removing |
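The rows of this table can be reproduced with a short sketch. It is written for re-implementers as an illustration under assumptions, not as the actual filetags code: the names =add_tag=, =remove_tag= and =_split= are hypothetical, and the =.lnk= handling just sets that suffix aside before looking for the real extension.

```python
import os

SEPARATOR = " -- "


def _split(filename):
    """Return (basename, list of tags, extension incl. optional .lnk)."""
    link_suffix = ""
    if filename.lower().endswith(".lnk"):
        filename, link_suffix = filename[:-4], filename[-4:]
    stem, extension = os.path.splitext(filename)
    if SEPARATOR in stem:
        basename, tag_string = stem.rsplit(SEPARATOR, 1)
    else:
        basename, tag_string = stem, ""
    return basename, tag_string.split(), extension + link_suffix


def _join(basename, tags, extension):
    if not tags:  # no tags left: the separator is removed as well
        return basename + extension
    return basename + SEPARATOR + " ".join(tags) + extension


def add_tag(filename, tag):
    basename, tags, extension = _split(filename)
    if tag not in tags:
        tags.append(tag)  # a new tag becomes the last tag
    return _join(basename, tags, extension)


def remove_tag(filename, tag):
    basename, tags, extension = _split(filename)
    remaining = [t for t in tags if t != tag]  # order stays the same
    return _join(basename, remaining, extension)


print(add_tag("Some file name.jpeg.lnk", "bar"))
print(remove_tag("Some file name -- foo bar.jpeg", "foo"))
```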
=filetags= may be used interactively (via =--interactive= or when no
"action" command line parameter is given) from the command line, or in
a script using command line parameters.
=filetags= offers a =--dryrun= option which does not modify any file
or directory.
Added tag(s) get appended as last tag(s).
When removing tags, their relative order is preserved.
When modifying any file that is a symbolic link or a Windows =.LNK=
file to a file that has the same basename (file name without path),
the linked/original file gets modified as well.
This comes in very handy when working within TagTrees (see below).
However, when modifying links which do not share the same
base-name with their link source, the link might become a broken one
(depending on the link technology used).
Un-tagging a tag from a file that does not have this tag is silently ignored.
FIXXME: describe =find_unique_alternative_to_file(filename)= and implications
=.filetags= files may be links (hardlinks, symbolic links or even Windows =.LNK= files)
Comments within =.filetags= files begin with one or more =#= characters that may be prepended by one or more spaces.
You can omit (case insensitive) tags from being proposed (selectable
via shortcuts =0-9=) by adding special comment lines like:
: #donotsuggest omit-this-tag dontshow
: #donotsuggest wontpropose
PLANNED: =.filetags= files may include other =.filetags= files via =#include =
Just invoke =filetags --tag-gardening= or =filetags --recursive
--tag-gardening= and read its output to learn about helpful analysis
results to curate your tags. My personal favorites are:
I am able to find typos in tags (tag count is low and similar tags are found).
I can determine tags that I seldom use and that therefore might be removed from my CVs.
Statistics on tag usage like, e.g.:
Distribution of mutually exclusive tag options.
Fraction of files that are not tagged.
Tags I have used which are not in my CVs.
Unused tags.
This feature is really powerful when it comes to maintaining your
file tags or gaining insight into your tagging patterns.
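Some of these statistics are easy to compute once the tags of each file are known. The following sketch is only an illustration of the idea: the function name =gardening_report= and the shape of its result are assumptions and do not mirror the output of =--tag-gardening=.

```python
from collections import Counter


def gardening_report(tags_per_file, vocabulary):
    """Summarize tag usage for a list of per-file tag lists and a
    controlled vocabulary (list of known tags)."""
    counts = Counter(tag for tags in tags_per_file for tag in tags)
    known = set(vocabulary)
    untagged = sum(1 for tags in tags_per_file if not tags)
    total = len(tags_per_file)
    return {
        "counts": counts,
        "not_in_vocabulary": sorted(set(counts) - known),
        "unused_vocabulary": sorted(known - set(counts)),
        "untagged_fraction": untagged / total if total else 0.0,
    }


report = gardening_report(
    [["scan", "correspondence"], ["correspondence"], [], ["scna"]],
    ["scan", "correspondence", "taxes"],
)
print(report["not_in_vocabulary"])   # -> ['scna']
print(report["unused_vocabulary"])   # -> ['taxes']
```

A tag such as =scna= that is used only once and is not part of the vocabulary is a typical candidate for a typo.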