OCR_Space
OCR_Space is a Go library based on the OCR.Space API. It reads text from PDF files and images.
Index
- Get your API key
- Installation
- Update
- Basic Usage
- Example Code
1. Get your API key
You can get your personal API key here (we will need this later)
2. Installation
go get -t github.com/ranghetto/go_ocr_space
3. Update
Delete the folder at $GOPATH/src/github.com/ranghetto/go_ocr_space, then run this command:
go get -t github.com/ranghetto/go_ocr_space
4. Basic Usage
First, create a configuration:
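package main

import (
	ocr "github.com/ranghetto/go_ocr_space"
)

func main() {
	// Arguments: API key, language code, OCR engine.
	config := ocr.InitConfig("yourApiKeyHere", "eng", ocr.OCREngine2)
	_ = config // placeholder so the snippet compiles; config is used by the Parse methods described below
}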
The first parameter is your API key as a string; the second is the code of the language you want to read from the file or image.
Here is a list of all available languages and their codes:
- Arabic = ara
- Bulgarian = bul
- Chinese (Simplified) = chs
- Chinese (Traditional) = cht
- Croatian = hrv
- Czech = cze
- Danish = dan
- Dutch = dut
- English = eng
- Finnish = fin
- French = fre
- German = ger
- Greek = gre
- Hungarian = hun
- Korean = kor
- Italian = ita
- Japanese = jpn
- Polish = pol
- Portuguese = por
- Russian = rus
- Slovenian = slv
- Spanish = spa
- Swedish = swe
- Turkish = tur
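It can help to check a language code against the list above before building a configuration. The following is a minimal sketch of such a check; supportedLanguages and languageName are hypothetical names introduced here for illustration and are not part of go_ocr_space.

```go
package main

import "fmt"

// supportedLanguages mirrors the language list in this README.
// Hypothetical helper data, not part of go_ocr_space.
var supportedLanguages = map[string]string{
	"ara": "Arabic", "bul": "Bulgarian", "chs": "Chinese (Simplified)",
	"cht": "Chinese (Traditional)", "hrv": "Croatian", "cze": "Czech",
	"dan": "Danish", "dut": "Dutch", "eng": "English", "fin": "Finnish",
	"fre": "French", "ger": "German", "gre": "Greek", "hun": "Hungarian",
	"kor": "Korean", "ita": "Italian", "jpn": "Japanese", "pol": "Polish",
	"por": "Portuguese", "rus": "Russian", "slv": "Slovenian",
	"spa": "Spanish", "swe": "Swedish", "tur": "Turkish",
}

// languageName returns the language name for a code and reports
// whether the code appears in the list.
func languageName(code string) (string, bool) {
	name, ok := supportedLanguages[code]
	return name, ok
}

func main() {
	for _, code := range []string{"eng", "ita", "xyz"} {
		if name, ok := languageName(code); ok {
			fmt.Printf("%s: %s\n", code, name)
		} else {
			fmt.Printf("%s: not in the list\n", code)
		}
	}
}
```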
The third parameter allows the OCR Engine selection:
OCREngine1
OCREngine2
OCREngine3
Read more about them here.
Now we can go ahead and start reading some text. Three methods let you do it:
config.ParseFromUrl("https://example.com/image.png")
config.ParseFromLocal("path/to/the/image.jpg")
config.ParseFromBase64("data:image/jpeg;base64,873hf9qehq98efwuehf...")
Method names are self-explanatory.
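As a rough illustration of how the three methods map to inputs, the sketch below picks a method name from the shape of the input string. inputKind is a hypothetical helper written for this README, not a function of go_ocr_space, and the prefix checks are an assumption about how inputs typically look.

```go
package main

import (
	"fmt"
	"strings"
)

// inputKind is a hypothetical helper (not part of go_ocr_space) that
// reports which of the three Parse methods fits a given input string.
func inputKind(input string) string {
	switch {
	case strings.HasPrefix(input, "data:"):
		return "ParseFromBase64"
	case strings.HasPrefix(input, "http://"), strings.HasPrefix(input, "https://"):
		return "ParseFromUrl"
	default:
		// Anything else is assumed to be a path on the local file system.
		return "ParseFromLocal"
	}
}

func main() {
	inputs := []string{
		"https://example.com/image.png",
		"path/to/the/image.jpg",
		"data:image/png;base64,aGVsbG8=",
	}
	for _, in := range inputs {
		fmt.Printf("%s -> %s\n", in, inputKind(in))
	}
}
```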
Remember:
.ParseFromBase64 needs a valid Base64 string as its parameter, in the format data:<file>/<extension>;base64,<image>
where:
- <file> is application in case of a pdf file, or image in case of an image
- <extension> is the extension of the file you encode. The only valid values are pdf, jpg, png and gif
- <image> is the actual Base64 encoding of your file
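The format above can be assembled with the standard library. The sketch below is a minimal example, assuming the rule exactly as stated in this README; dataURI is a hypothetical helper, not part of go_ocr_space. Note that the ParseFromBase64 example earlier uses image/jpeg rather than image/jpg, so the exact spelling expected for JPEG files is worth verifying against the OCR.Space documentation.

```go
package main

import (
	"encoding/base64"
	"fmt"
	"path/filepath"
	"strings"
)

// dataURI is a hypothetical helper (not part of go_ocr_space) that builds
// a data:<file>/<extension>;base64,<image> string from a file name and
// the raw bytes of that file.
func dataURI(filename string, content []byte) (string, error) {
	ext := strings.ToLower(strings.TrimPrefix(filepath.Ext(filename), "."))

	var kind string
	switch ext {
	case "pdf":
		kind = "application"
	case "jpg", "png", "gif":
		kind = "image"
	default:
		return "", fmt.Errorf("unsupported extension %q", ext)
	}

	encoded := base64.StdEncoding.EncodeToString(content)
	return "data:" + kind + "/" + ext + ";base64," + encoded, nil
}

func main() {
	// In real use, content would come from os.ReadFile on the image or PDF.
	uri, err := dataURI("scan.png", []byte("hello"))
	if err != nil {
		fmt.Println(err)
		return
	}
	fmt.Println(uri) // prints data:image/png;base64,aGVsbG8=
}
```

The resulting string is what would be passed to config.ParseFromBase64.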
These methods return the complete struct, with all the fields that OCR.Space provides.
If you are only interested in the output text, call the .JustText() method on the result returned by one of the three methods mentioned above.
5. Example Code
package main

import (
	"fmt"

	ocr "github.com/ranghetto/go_ocr_space"
)

func main() {
	apiKey := "helloworld"
	config := ocr.InitConfig(apiKey, "eng", ocr.OCREngine2)

	// Send the image URL to OCR.Space and get back the full result struct.
	result, err := config.ParseFromUrl("https://www.azquotes.com/picture-quotes/quote-maybe-we-should-all-just-listen-to-records-and-quit-our-jobs-jack-white-81-40-26.jpg")
	if err != nil {
		fmt.Println(err)
		return // do not use result after a failed request
	}

	// Print only the recognized text.
	fmt.Println(result.JustText())
}
Example output
Maybe we should all just listen to
records and quit our jobs
Jack White
AZ QUOTES