A Python wrapper to extract text from images on a Mac system. Uses the Vision framework from Apple.
A small Python wrapper to extract text from images on a Mac system. It uses Apple's Vision framework. Simply pass a path to an image or a PIL image directly and get lists of texts, their confidence, and bounding box.
This only works on macOS 10.15 or newer.
Install via pip:
pip install ocrmac
from ocrmac import ocrmac
annotations = ocrmac.OCR('test.png').recognize()
print(annotations)
Output (Text, Confidence, BoundingBox):
[("GitHub: Let's build from here - X", 0.5, [0.16, 0.91, 0.17, 0.01]),
('github.com', 0.5, [0.174, 0.87, 0.06, 0.01]),
('Qi &0 O M #O', 0.30, [0.65, 0.87, 0.23, 0.02]),
[...]
('P&G U TELUS', 0.5, [0.64, 0.16, 0.22, 0.03])]
(BoundingBox precision capped for readability reasons)
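The bounding boxes above are fractions of the image size rather than pixels. As a minimal sketch, assuming the box is [x, y, width, height] with the origin in the bottom-left corner (the convention Apple's Vision framework uses), the conversion to pixel coordinates with a top-left origin could look like this. The helper name to_pixel_box and the 1000 x 800 image size are hypothetical and only serve the illustration:

```python
from typing import Sequence, Tuple


def to_pixel_box(
    bbox: Sequence[float], image_width: int, image_height: int
) -> Tuple[int, int, int, int]:
    """Convert a normalized [x, y, w, h] box to (left, top, right, bottom) pixels.

    Assumption: the normalized origin is the bottom-left corner, so the
    y axis is flipped to get top-left-origin coordinates as used by PIL.
    """
    x, y, w, h = bbox
    left = x * image_width
    right = (x + w) * image_width
    top = (1.0 - y - h) * image_height
    bottom = (1.0 - y) * image_height
    return (round(left), round(top), round(right), round(bottom))


# One annotation taken from the sample output above; the image size is made up.
text, confidence, bbox = ("github.com", 0.5, [0.174, 0.87, 0.06, 0.01])
print(text, confidence, to_pixel_box(bbox, image_width=1000, image_height=800))
```

The resulting tuple can be passed to PIL calls that expect a (left, top, right, bottom) box, for example Image.crop.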
from ocrmac import ocrmac
ocrmac.OCR('test.png').annotate_PIL()
You can use it as a class (ocrmac.OCR) or as a function (ocrmac.text_from_image).
You can pass several arguments:
recognition_level: fast or accurate
language_preference: A list with languages for post-processing, e.g. ['en-US', 'zh-Hans', 'de-DE']
You can get an annotated output either as a PIL image (.annotate_PIL) or as a matplotlib figure (annotate_matplotlib).
You can use either the vision or the livetext framework as backend.
You can set a language preference like so:
ocrmac.OCR('test.png', language_preference=['en-US'])
What abbreviation should you use for your language of choice? Here is an overview of language codes, e.g.: Chinese (Simplified) -> zh-Hans, English -> en-US.
If you set a wrong language, you will see an error message showing the languages available. Note that the recognition_level will affect the languages available (fast has fewer).
See also this Example Notebook for implementation details.
Timings for the above recognize statement on a MacBook Pro (Apple M3 Max):
accurate: 207 ms ± 1.49 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
fast: 131 ms ± 702 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
livetext: 174 ms ± 4.12 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
Since macOS Sonoma, LiveText is supported, which is stronger than the Vision framework OCR. You can try this feature by:
# Use the OCR class
from ocrmac import ocrmac
annotations = ocrmac.OCR('test.png', framework="livetext").recognize()
print(annotations)
# Or use the helper directly
annotations = ocrmac.livetext_from_image('test.png')
Note that when using this feature, recognition_level and confidence_threshold are not available. The confidence output will always be 1.
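Because livetext needs macOS Sonoma (macOS 14) while the vision backend also runs on older systems, a script that should work on both may want to pick the backend at runtime. The following is only a sketch of one way to do that: pick_framework is a hypothetical helper, not part of ocrmac, and it relies on the standard library's platform.mac_ver() to read the macOS version.

```python
import platform


def pick_framework(mac_version: str) -> str:
    """Return "livetext" on macOS 14 (Sonoma) or newer, otherwise "vision".

    Hypothetical helper for illustration; ocrmac does not ship it.
    """
    try:
        major = int(mac_version.split(".")[0])
    except ValueError:
        # Empty or unparsable version string, e.g. when not running on macOS.
        return "vision"
    return "livetext" if major >= 14 else "vision"


# platform.mac_ver() returns an empty version string on non-Mac systems,
# in which case the helper falls back to "vision".
framework = pick_framework(platform.mac_ver()[0])
print(framework)
```

On a Mac, the result could then be passed on as ocrmac.OCR('test.png', framework=framework). Keep in mind that the confidence values differ between the two backends, as noted above.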
If you want to do optical character recognition (OCR) with Python, widely used tools are pytesseract or EasyOCR. For me, tesseract never did give great results. EasyOCR did, but it is slow on CPU. While there is GPU acceleration with CUDA, this does not work for Mac. (Update from 9/2023: Apparently EasyOCR now has mps support for Mac.)
In any case, as a Mac user you might notice that you can, with newer versions, directly copy and paste from images. The built-in OCR functionality is quite good. The underlying functionality for this is VNRecognizeTextRequest from Apple's Vision Framework. Unfortunately it is in Swift; luckily, a wrapper for this exists: pyobjc-framework-Vision. ocrmac utilizes this wrapper and provides an easy interface to use this for OCR.
I found the following resources very helpful when implementing this:
I also did a small write-up about OCR on Mac in this blog post on medium.com.
If you have a feature request or a bug report, please post it either as an idea in the discussions or as an issue on the GitHub issue tracker. If you want to contribute, open a PR for it. Thanks!
If you like the project, consider starring it!
FAQs
We found that ocrmac demonstrated a healthy version release cadence and project activity because the last version was released less than a year ago. It has 1 open source maintainer collaborating on the project.