github.com/ranghetto/go_ocr_space
This OCR library is based on the OCR.Space API and is meant to read text from PDF files and images in Go.
You can get your personal API key here (we will need this later)
go get -t github.com/ranghetto/go_ocr_space
To update, delete the folder located at $GOPATH/src/github.com/ranghetto/go_ocr_space and then run this command:
go get -t github.com/ranghetto/go_ocr_space
First, create a configuration:
package main

import (
	/*
		Remember to run your program from $GOPATH/src/name_of_your_folder,
		or to provide the correct path to this library, which is located
		in $GOPATH/src/github.com/ranghetto/go_ocr_space
	*/
	ocr "github.com/ranghetto/go_ocr_space"
	// Other libraries...
)

func main() {
	// The engine constant is qualified with the ocr package alias
	config := ocr.InitConfig("yourApiKeyHere", "eng", ocr.OCREngine2)
	// More code here...
}
The first parameter is your API key as a string; the second is the code of the language you want to read from the file or image. Here is a list of all available language codes:
ara
bul
chs
cht
hrv
cze
dan
dut
eng
fin
fre
ger
gre
hun
kor
ita
jpn
pol
por
rus
slv
spa
swe
tur
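Because the language is passed as a plain string, a typo such as "en" instead of "eng" would only show up when the API is called. The following is a minimal sketch of a local check against the codes listed above; `supportedLanguages` and `isSupportedLanguage` are hypothetical names introduced for illustration and are not part of go_ocr_space.

```go
package main

import "fmt"

// supportedLanguages is a local copy of the codes listed in this README.
// It is an illustration only and is not exported by go_ocr_space.
var supportedLanguages = map[string]bool{
	"ara": true, "bul": true, "chs": true, "cht": true, "hrv": true, "cze": true,
	"dan": true, "dut": true, "eng": true, "fin": true, "fre": true, "ger": true,
	"gre": true, "hun": true, "kor": true, "ita": true, "jpn": true, "pol": true,
	"por": true, "rus": true, "slv": true, "spa": true, "swe": true, "tur": true,
}

// isSupportedLanguage reports whether code appears in the list above.
func isSupportedLanguage(code string) bool {
	return supportedLanguages[code]
}

func main() {
	for _, code := range []string{"eng", "en"} {
		fmt.Printf("%s supported: %t\n", code, isSupportedLanguage(code))
	}
}
```

Running such a check before calling `ocr.InitConfig` lets a program fail early with a clear message.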
The third parameter selects the OCR engine:
OCREngine1
OCREngine2
OCREngine3
Read more about them here.
Now we can go ahead and start reading some text; there are three methods that let you do it:
config.ParseFromUrl("https://example.com/image.png")
config.ParseFromLocal("path/to/the/image.jpg")
config.ParseFromBase64("data:image/jpeg;base64,873hf9qehq98efwuehf...")
Method names are self-explanatory.
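When the input arrives as a single string (for example from a command-line argument), a program has to decide which of the three methods to call. The sketch below shows one possible way to make that decision from the shape of the string; `parseMethodFor` is a hypothetical helper, not part of go_ocr_space, and the prefix checks are assumptions about what counts as a URL or a Base64 data URI.

```go
package main

import (
	"fmt"
	"strings"
)

// parseMethodFor returns the name of the go_ocr_space method that fits
// the given input string. It is a hypothetical helper for illustration.
func parseMethodFor(input string) string {
	switch {
	case strings.HasPrefix(input, "data:") && strings.Contains(input, ";base64,"):
		// Looks like a Base64 data URI
		return "ParseFromBase64"
	case strings.HasPrefix(input, "http://") || strings.HasPrefix(input, "https://"):
		// Looks like a remote URL
		return "ParseFromUrl"
	default:
		// Anything else is treated as a local file path
		return "ParseFromLocal"
	}
}

func main() {
	inputs := []string{
		"https://example.com/image.png",
		"path/to/the/image.jpg",
		"data:image/png;base64,aGVsbG8=",
	}
	for _, in := range inputs {
		fmt.Println(parseMethodFor(in))
	}
}
```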
Remember:
.ParseFromBase64 needs as its parameter a valid Base64 string in the format data:<file>/<extension>;base64,<image>
where:
<file> is application in the case of a pdf file, or image in the case of an image
<extension> is the extension of the file you encoded; the only valid values are pdf, jpg, png and gif
<image> is the actual Base64 encoding of your file
These methods give you back the whole struct, complete with all the parameters that OCR.Space provides.
If you are only interested in the output text, call the .JustText() method on the result returned by one of the three methods mentioned above.
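The data:<file>/<extension>;base64,<image> format expected by .ParseFromBase64 can be assembled with the standard library. This is a minimal sketch; `buildDataURI` is a hypothetical helper and not part of go_ocr_space, and the sample bytes stand in for real file content.

```go
package main

import (
	"encoding/base64"
	"fmt"
)

// buildDataURI assembles a string in the form
// data:<file>/<extension>;base64,<image>.
// kind is "application" for a PDF or "image" for an image;
// ext is one of pdf, jpg, png or gif.
// It is a hypothetical helper, not part of go_ocr_space.
func buildDataURI(kind, ext string, content []byte) string {
	encoded := base64.StdEncoding.EncodeToString(content)
	return "data:" + kind + "/" + ext + ";base64," + encoded
}

func main() {
	// Placeholder bytes; a real program would read them from a file.
	fmt.Println(buildDataURI("image", "png", []byte("hello")))
	// Prints: data:image/png;base64,aGVsbG8=
}
```

In a real program the bytes would typically come from `os.ReadFile`, and the resulting string would be passed to `config.ParseFromBase64`.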
package main

import (
	"fmt"

	ocr "github.com/ranghetto/go_ocr_space"
)

func main() {
	// This is a demo API key
	apiKey := "helloworld"
	// Set up the configuration
	config := ocr.InitConfig(apiKey, "eng", ocr.OCREngine2)
	// Parse the image at the URL (result holds the full response struct)
	result, err := config.ParseFromUrl("https://www.azquotes.com/picture-quotes/quote-maybe-we-should-all-just-listen-to-records-and-quit-our-jobs-jack-white-81-40-26.jpg")
	if err != nil {
		// Stop here: result is not usable when the request failed
		fmt.Println(err)
		return
	}
	// Print just the parsed text
	fmt.Println(result.JustText())
}
Output:
Maybe we should all just listen to
records and quit our jobs
Jack White
AZ QUOTES