collective.documentviewer
collective.documentviewer integrates the DocumentCloud_ viewer and PDF processing into Plone_.
You can see the functionality this add-on implements in action at the following sites:

- Very nice document viewer.
- OCR.
- Searchable on OCR text.
- Works with many different document types.
- collective.celery_ integration.
- Lots of configuration options.
- PDF Album view for displaying groups of PDFs.

Besides displaying PDFs, it will also display:

- Word.
- Excel.
- Powerpoint.
- HTML.
- RTF.

This product has been translated into:

- German.
- Spanish.
- Basque.
- French.
- Italian.
- Dutch.
- Simplified Chinese.

To contribute missing messages or new languages, join the `Plone Collective Team <https://www.transifex.com/plone/plone-collective/>`_ on the Transifex.net service, alongside the worldwide community of Plone translators.

- GraphicsMagick.
- ghostscript (version 9.0 preferred).
- Poppler
- tesseract (optional)
- qpdf
- OpenOffice or LibreOffice (optional, for doc, excel, ppt, etc. types)
- md5 or md5sum command line tool.

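
Before installing, it can help to confirm that the external tools listed above are available on the ``PATH``. The following is a minimal sketch, not part of the add-on: the executable names (``gm``, ``gs``, ``pdftotext``, ``soffice`` and so on) are assumptions based on how these tools are commonly packaged, and the add-on's own lookup may differ.

```python
import shutil

# Candidate executable names per dependency. These names are assumptions
# based on common packaging, not taken from the add-on's own lookup code.
REQUIRED = {
    "GraphicsMagick": ["gm"],
    "Ghostscript": ["gs"],
    "Poppler": ["pdftotext"],
    "qpdf": ["qpdf"],
    "md5 tool": ["md5sum", "md5"],
}
OPTIONAL = {
    "tesseract": ["tesseract"],
    "OpenOffice/LibreOffice": ["soffice", "libreoffice"],
}


def find_missing(dependencies, which=shutil.which):
    """Return the names of dependencies with no candidate executable on PATH."""
    return [
        name
        for name, candidates in dependencies.items()
        if not any(which(candidate) for candidate in candidates)
    ]


if __name__ == "__main__":
    for label, deps in (("required", REQUIRED), ("optional", OPTIONAL)):
        missing = find_missing(deps)
        print(f"Missing {label} tools: {', '.join(missing) or 'none'}")
```

The ``which`` parameter exists only so the lookup can be replaced in tests; in normal use the default ``shutil.which`` is enough.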
Special instructions for CentOS have been contributed by Eric Tyrer.
You can access them in the `CENTOS-INSTALL.rst file of the GitHub repository <https://github.com/collective/collective.documentviewer/blob/master/CENTOS-INSTALL.rst>`_.
Special instructions for Debian have been contributed by Leonardo J. Caballero G.
You can access them in the `DEBIAN-INSTALL.rst file of the GitHub repository <https://github.com/collective/collective.documentviewer/blob/master/DEBIAN-INSTALL.rst>`_.
If on a Linux/Ubuntu/Debian machine you run into an error like::

    /var/lib/gems/1.9.1/gems/docsplit-0.7.2/lib/docsplit/image_extractor.rb:51:in `exists?': can't convert nil into String (TypeError)
    from /var/lib/gems/1.9.1/gems/docsplit-0.7.2/lib/docsplit/image_extractor.rb:51:in `ensure in convert'

This happens because the Ruby docsplit library has trouble accessing the temp folder and removing temp files. Run the following command::

    sudo chmod 1777 /tmp && sudo chmod 1777 /var/tmp

Then retry the conversion of your document.
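
To check whether the temp directories already have the expected permissions before changing anything, a small script along these lines may help. This is an illustrative sketch only; the helper name ``has_sticky_world_writable_mode`` is hypothetical and not part of the add-on.

```python
import os
import stat


def has_sticky_world_writable_mode(path):
    """Return True if `path` has mode 1777 (sticky bit, rwx for everyone)."""
    mode = stat.S_IMODE(os.stat(path).st_mode)
    return mode == 0o1777


if __name__ == "__main__":
    for directory in ("/tmp", "/var/tmp"):
        if not os.path.isdir(directory):
            print(f"{directory}: not found")
            continue
        ok = has_sticky_world_writable_mode(directory)
        print(f"{directory}: {'ok' if ok else 'needs chmod 1777'}")
```

Mode ``1777`` combines world read/write/execute with the sticky bit, which lets every user create temp files while only the owner of a file can remove it.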
Normal flow::

    git clone git@github.com:collective/collective.documentviewer.git
    cd collective.documentviewer
    virtualenv .
    bin/pip install -r requirements.txt
    bin/buildout

It is highly recommended to install and configure collective.celery_ in combination with this package. Doing so runs all PDF conversion processes asynchronously, so the user is not delayed as much when saving files.
The product can be configured via the control panel item ``Document Viewer Settings``.
Some interesting configuration options:

Storage Type
    If you want to be able to serve your files via Amazon Cloud, this will allow you to store the data in flat files that can be synced to another server.
Storage Location
    Where on the server to store the files.
OCR
    Use ``tesseract`` to scan the document for text. This process can be slow, so if your PDFs do not need to be OCR'd, you may disable it.
Auto Select Layout
    For PDF files added to the site, automatically select the document viewer display.
Auto Convert
    When PDF files are added or modified, automatically convert them.
Auto layout file types
    Types that should automatically be converted to document viewer.

If you want to use it with your own Dexterity content type, you need to edit the FTI in ``ZMI/portal_types/yourtype`` to add "documentviewer" to the available view methods, like this::

    <property name="view_methods" purge="False">
      <element value="documentviewer"/>
    </property>

You also need to set the primary field in the schema, for example::

    <field name="myfile" marshal:primary="true"
           type="plone.namedfile.field.NamedBlobFile">

If you choose to use basic file storage instead of ZODB blob storage, there are a few things you'll want to keep in mind.
Use Nginx_ to serve the files from the file system. This might require you to install a local Nginx on the Plone server just for serving file storage. You can get creative with how your file storage is used, though.
Because Plone's delete operation can be interrupted, and the deletion of a file on the OS file system cannot be done within a transaction, no files are ever deleted. However, there is an action you can put in a cron_ task to clean up your file storage directory. Just call the URL ``http://zeoinstace/plone/@@dvcleanup-filestorage``.
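
A cron job could call the cleanup view with a small script such as the sketch below. Only the ``@@dvcleanup-filestorage`` view name comes from this document. The function names and the site URL are hypothetical, and the sketch assumes the view is reachable without authentication, which may not hold on your site.

```python
import sys
import urllib.request

# View name taken from the documentation above.
CLEANUP_VIEW = "@@dvcleanup-filestorage"


def cleanup_url(site_url):
    """Build the cleanup view URL for a Plone site root URL."""
    return site_url.rstrip("/") + "/" + CLEANUP_VIEW


def run_cleanup(site_url, timeout=300):
    """Request the cleanup view and return the HTTP status code.

    Assumes no authentication is needed; add credentials if your site
    protects the view.
    """
    with urllib.request.urlopen(cleanup_url(site_url), timeout=timeout) as response:
        return response.status


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python dvcleanup.py http://localhost:8080/Plone
    print(run_cleanup(sys.argv[1]))
```

A crontab entry could then run the script once a night, for example ``0 3 * * * python3 /path/to/dvcleanup.py http://localhost:8080/Plone``, where both the script path and the site URL are placeholders.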
If you currently have page turner installed, this project will supersede it. Your page turner views will still work, but no future files added to the site will be converted to page turner.
To convert an existing view, every page turner enabled file has a ``Document Viewer Convert`` button that you can click to manually convert page turner to document viewer.
To convert all existing views, go to ``portal_setup`` in the ZMI, open the upgrades, select ``collective.documentviewer``, click to show old upgrades, and there should be an ``upgrade-all`` step to run.
This add-on is tested using Travis CI. The current status of the add-on is:

.. image:: https://travis-ci.org/collective/collective.documentviewer.svg?branch=master
   :alt: Travis CI badge
   :target: https://travis-ci.org/collective/collective.documentviewer

.. image:: http://img.shields.io/pypi/v/collective.documentviewer.svg
   :alt: PyPI badge
   :target: https://pypi.org/project/collective.documentviewer

Have an idea? Found a bug? Let us know by `opening a ticket`_.
This product was developed by the Wildcard Corp. team.

.. image:: https://raw.githubusercontent.com/collective/collective.documentviewer/i18n_improvements/docs/_static/wildcardcorp_logo.png
   :height: 111px
   :width: 330px
   :alt: Produced by wildcardcorp.com
   :align: right

The project is licensed under the GPLv2.
.. _DocumentCloud: https://www.documentcloud.org/
.. _Plone: https://plone.org/
.. _collective.celery: https://pypi.org/project/collective.celery/
.. _Nginx: https://nginx.org/
.. _cron: https://crontab.guru/
.. _`opening a ticket`: https://github.com/collective/collective.documentviewer/issues
``DOCUMENTVIEWER_QPDF_PARAMETERS`` environment variable [mpeeters]
Added Transifex.net service integration to manage the translation process. [macagua]
Updated Spanish translation. [macagua]
Updated the i18n support. [macagua]
Fix the download link for the document. #78 [b4oshany]
Replaced ``docsplit``. Instead call the various packages directly. See `pull request #79 <https://github.com/collective/collective.documentviewer/pull/79>`_. [alphaomega325]
Python 3, Plone 5.2 compatible [vangheem]
``plone.api.portal.get`` instead of ``getToolByName`` [vangheem]
Fix to work with latest collective.celery [vangheem]
Fix issue breaking zoom on the 1st page of PDFs [obct537]
Add function and browser view (``convert_all_unconverted``) to convert all files which haven't been converted yet. [thet]
Do not break if no global request is set. Fixes #71 [ale-rt]
Fix redundant condition [ale-rt]
Handle plone.app.contenttypes file indexing. [thet]
Add a custom migrator for plone.app.contenttypes and avoid converting while migrating to plone.app.contenttypes. [thet]
Added support for libreoffice under Nixos, which uses a different folder name for its conversion directories [pysailor]
Added Italian translation [keul]
Fixed JavaScript issue on Chrome: expected global variable ``sidebar`` was not global [keul]
handle conflict errors in async processes better [vangheem]
Handle file deleted to clean up files [vangheem]
fix not being able to hide sidebar [vangheem]
do not convert Image types [vangheem]
be able to completely hide contributor [vangheem]
add lead image support [vangheem]
be able to use collective.celery for queuing tasks [vangheem]
fix async monitor registration [pilz]
fix Plone 5 compatibility [vangheem]
upgrade jquery.imgareaselect to latest [vangheem]
upgrade document viewer to latest [vangheem]
do not support upgrading from wildcard.pdfpal and wc.pageturner anymore. Use 3.x series [vangheem]
Add Dexterity compatibility. To enable it on your content type, you have to define a primary field and add documentviewer in the available view methods, see documentation. [vincentfretin]
Fix: users that can modify can now view info messages and 'annotations'/'sections' feature. [thomasdesvenain]
Show contributor fullname if possible. Contributor and organization are in a span. [thomasdesvenain]
Avoid replacing non-ascii characters by (?) during OCR process for non english languages. [thomasdesvenain]
Plain text indexation is fixed for non converted contents. [thomasdesvenain]
When a new release of the document is currently generated, user is notified by a status message. [thomasdesvenain]
i18n fixes + French translations [thomasdesvenain]
support to pass a document language to tesseract/docsplit based on a configurable adapter implementing IOCRLanguage [ajung]
added French translations [gbastien]
added enable_indexation parameter in global and local settings Fixes : https://github.com/collective/collective.documentviewer/issues/21 [gbastien]
make local settings coherent regarding global settings Fixes : https://github.com/collective/collective.documentviewer/issues/22 [gbastien]
fix use with latest libreoffice and docsplit. Fixes: https://github.com/collective/collective.documentviewer/issues/11
do not require docsplit to be installed on the plone instance in order to display the viewer. In case the document was converted on another client. [vangheem]
switch to using OFS.interfaces.IFolder for folder view [vangheem]
while pdf is converting, show existing if available. [vangheem]
move convert button to actions [vangheem]
test for Plone 4.2 compatibility. [hvelarde]
work with subsites
fix cleaning file location
fix potential traversal error for file resources
include contentmenu zcml dependency
upgrade conversion will now try and fix error'd conversions
be able to move jobs to front of queue
use portal_catalog instead of uid_catalog so security checks apply to resource urls.
create local catalog and index before syncing db to prevent conflict errors.
add redirect timeout to conversion info page
make sure to close open file descriptors
Change "Original Document (PDF)" to "Original Document"
emit event after conversion
only show queue link if manager
convert button should work for files that do not have layout selected yet
use communicate instead of wait with popen in case output is large. Prevents deadlocks.
do not assume pdfpal is used along with pageturner on data conversion.
better command runner
track errors better and display them in interface if something happened during conversion
new file storage structure to prevent too many files from being in one directory
fix full screen button when text or pages selected.
be able to customize batch size
default OCR to being off since it's pretty slow
better logging when looking for binary files
be able to override width of viewer
make sure to initialize catalog after db sync for large PDFs. [vangheem]
better integrate with pdfpal and pageturner so it's easy to upgrade from those products. [vangheem]
fix setting custom quota for async queue [vangheem]
fix group view clear button [vangheem]
add support for alternative md5sum binary [vangheem]
fix full screen page bug [vangheem]
better async integration with quota setting [vangheem]
View async queue for conversions [vangheem]
index ocr data in portal catalog [vangheem]
better pdf group view with search [vangheem]
handle large files better [vangheem]
check if file has already been converted by storing hash of the file to check against. [vangheem]
be able to remove document viewer conversion tasks [vangheem]
add ability to cleanup file storage files for deleted plone File objects. [vangheem]
add pdf folder album view [vangheem]
fix async integration [vangheem]
add control panel icon [vangheem]
fix uninstall procedure [vangheem]
changing image type does not cause existing ones to fail. [vangheem]
FAQs
DocumentCloud's document viewer integration into Plone.
We found that collective.documentviewer demonstrated a healthy version release cadence and project activity because the last version was released less than a year ago. It has 10 open source maintainers collaborating on the project.