working mechanism
#11
by BoccheseGiacomo - opened
I have a question: does this only work with text documents, or also with images? If I have a PDF formatted as an image, does this work? And if I have a PDF with tables, does it convert everything to raw UTF-8 text, or is it able to process structures (images, tables, HTML text) as they are?
Thanks
As far as I can tell, it only uses the text extracted from the images, and that text needs to be in a "segmentId" format.
However, check katanaml here, and also the GitHub repo: https://github.com/katanaml/sparrow
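For anyone who wants to try the image case, here is a minimal sketch, not a definitive recipe. It assumes the Transformers `document-question-answering` pipeline, which (as far as I know) runs OCR on the page image through `pytesseract` when it is installed, so the model sees words plus their positions rather than the raw picture. The helper `normalize_box`, the function `ask_document`, and the file name `invoice.png` are hypothetical names used only for illustration. The 0-1000 box scaling reflects how LayoutLM-style models usually expect coordinates.

```python
from typing import Tuple

Box = Tuple[int, int, int, int]


def normalize_box(box: Tuple[float, float, float, float], width: int, height: int) -> Box:
    """Scale a pixel-space (x0, y0, x1, y1) box to the 0-1000 range."""
    x0, y0, x1, y1 = box
    return (
        int(1000 * x0 / width),
        int(1000 * y0 / height),
        int(1000 * x1 / width),
        int(1000 * y1 / height),
    )


def ask_document(image_path: str, question: str):
    """Run document QA on one page image (hypothetical helper).

    Assumes transformers, Pillow, and pytesseract (plus the Tesseract
    binary) are installed; the pipeline performs the OCR step itself.
    """
    # Imported lazily so normalize_box works without these packages.
    from PIL import Image
    from transformers import pipeline

    pipe = pipeline(
        "document-question-answering",
        model="impira/layoutlm-document-qa",
    )
    return pipe(image=Image.open(image_path), question=question)


if __name__ == "__main__":
    # "invoice.png" is a placeholder; an image-only PDF would first have
    # to be rendered to page images with a tool of your choice.
    print(ask_document("invoice.png", "What is the invoice number?"))
```

If you already have OCR output from another tool, my understanding is that the pipeline also accepts an optional `word_boxes` argument (a list of word and box pairs, with boxes scaled to 0-1000, which is what `normalize_box` produces), in which case it skips its own OCR. Tables are not handled as structured objects in this sketch: the model only gets words and their positions on the page.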
Thanks for the GitHub repo, that's really cool.
BoccheseGiacomo changed discussion status to closed