Problem description
How to read a .pdf file programmatically and convert it into audio (.mp3 format)?
I want to parse a PDF file from my C# app and create an audio file from it. How would I do that?
I'm particularly looking for a good PDF-to-text library, or some other way to extract the text from a PDF file.
Reference solutions
Method 1:
Ideally, your input is a tagged PDF document, that is, one that contains tags marking up its logical structure (a typical PDF document contains only visual information).
A tagged PDF can then be converted into DAISY format, a standard for digital talking books: an intermediate XML format that stores the text of a book along with its logical structure and navigation features.
The DAISY XML can either be converted to an audio format, or played back directly on a DAISY reader, a physical device similar to an MP3 player.
There is a presentation available on the DAISY website explaining the principles of this toolchain:
Accessible PDF to DAISY/NIMAS Conversion
Method 2:
Use Festival for the text-to-speech step. Various PDF-to-text APIs exist...
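As a rough illustration of the Festival step, the sketch below (in Python rather than C#, purely for brevity) shells out to text2wave, the command-line script that ships with Festival. It assumes Festival is installed and text2wave is on the PATH; the function names and file names are hypothetical.

```python
import shutil
import subprocess


def text2wave_command(text_path, wav_path):
    """Build the argument list for Festival's text2wave script."""
    return ["text2wave", text_path, "-o", wav_path]


def synthesize(text_path, wav_path):
    """Run text2wave to turn a plain-text file into a WAV file."""
    # Assumption: Festival is installed and text2wave is on the PATH.
    if shutil.which("text2wave") is None:
        raise RuntimeError("text2wave not found; is Festival installed?")
    subprocess.run(text2wave_command(text_path, wav_path), check=True)
```

Calling synthesize("book.txt", "book.wav") would write the spoken text to book.wav, which can then be encoded to MP3 in a separate step.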
Method 3:
You need the Speech SDK from Microsoft. Read the instructions here
Method 4:
As the other posters outlined, first you have to extract the text from the .pdf file. PDF is an open format now, so you can probably find a parser through Google.
Then you have to pick out the text you actually want converted to speech, ignoring things like figure titles, page headers, the table of contents, etc.
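One way to approach this filtering step is sketched below, in Python rather than C# for brevity. It assumes the PDF parser has already produced one string per page. The heuristics are illustrative guesses, not a proven recipe: a line that appears on more than half the pages is treated as a running header or footer, and lines that look like figure or table captions or bare page numbers are dropped. The function name strip_boilerplate is hypothetical.

```python
import re
from collections import Counter

# Hypothetical heuristics; tune them for the documents at hand.
CAPTION_RE = re.compile(r"^(Figure|Fig\.|Table)\s+\d+", re.IGNORECASE)
PAGE_NUMBER_RE = re.compile(r"^(Page\s+)?\d+$", re.IGNORECASE)


def strip_boilerplate(pages):
    """Return the body text from a list of per-page strings."""
    # Count on how many pages each distinct non-empty line occurs.
    line_pages = Counter()
    for page in pages:
        distinct = {l.strip() for l in page.splitlines() if l.strip()}
        for line in distinct:
            line_pages[line] += 1

    # A line on more than half the pages is probably a header or footer.
    # Skip this check for very short documents, where it is unreliable.
    repeated = set()
    if len(pages) >= 3:
        threshold = len(pages) / 2
        repeated = {l for l, n in line_pages.items() if n > threshold}

    kept = []
    for page in pages:
        for raw in page.splitlines():
            line = raw.strip()
            if not line or line in repeated:
                continue
            if CAPTION_RE.match(line) or PAGE_NUMBER_RE.match(line):
                continue
            kept.append(line)
    return "\n".join(kept)
```

A table of contents is harder to detect reliably; skipping the pages it occupies by hand is likely the simplest option.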
Once you've got the text, you need to convert it to speech. This is probably the hardest part.
A while ago I was fiddling around with generating voice files for a gaming mod, since I'm a rotten voice actor.
Cepstral had the best TTS converters I could find. (The free ones had an annoying tendency to insert Cepstral advertisements in the speech, but I could manually edit this out for what I was doing.)
It turns out that there's a speech synthesis markup language (SSML) which can be used to give the TTS converter hints about which syllables to stress, etc. Here's a link:
http://www.w3.org/TR/speech-synthesis/
How you go about automatically adding the SSML to the text is a bit beyond me.
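A very basic starting point, sketched in Python rather than C# for brevity, is to add only structural markup: wrap each paragraph in a p element and each sentence in an s element inside the SSML speak root. This does nothing about stress or pronunciation, and the sentence splitter is a naive assumption that will trip over abbreviations. The function name text_to_ssml is hypothetical.

```python
import re
from xml.sax.saxutils import escape, quoteattr

SSML_NS = "http://www.w3.org/2001/10/synthesis"


def text_to_ssml(text, lang="en-US"):
    """Wrap plain text in minimal SSML paragraph and sentence markup."""
    # Paragraphs are assumed to be separated by blank lines.
    paragraphs = [p.strip() for p in re.split(r"\n\s*\n", text) if p.strip()]

    parts = [
        '<speak version="1.0" xmlns="%s" xml:lang=%s>'
        % (SSML_NS, quoteattr(lang))
    ]
    for para in paragraphs:
        # Collapse line breaks, then split naively after ., ! or ?
        flat = " ".join(para.split())
        sentences = re.split(r"(?<=[.!?])\s+", flat)
        parts.append("<p>")
        for sentence in sentences:
            # escape() protects &, < and > so the output stays valid XML.
            parts.append("<s>%s</s>" % escape(sentence))
        parts.append("</p>")
    parts.append("</speak>")
    return "\n".join(parts)
```

Whether a given TTS engine honours these elements depends on its SSML support, so check the engine's documentation before relying on them.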
Anyway, the TTS converter will produce an audio file, and the final step would be to compress the audio at the desired bit rate in mp3 format.
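For the final compression step, one option is the LAME command-line encoder. The sketch below (Python rather than C#, for brevity) assumes the TTS step produced a WAV file and that lame is installed and on the PATH; the function names and the 128 kbps default are illustrative choices.

```python
import shutil
import subprocess


def lame_command(wav_path, mp3_path, kbps=128):
    """Build the LAME argument list; -b sets the bit rate in kbps."""
    return ["lame", "-b", str(kbps), wav_path, mp3_path]


def encode_mp3(wav_path, mp3_path, kbps=128):
    """Encode a WAV file to MP3 at the requested bit rate."""
    # Assumption: the lame executable is available on the PATH.
    if shutil.which("lame") is None:
        raise RuntimeError("lame not found on PATH")
    subprocess.run(lame_command(wav_path, mp3_path, kbps), check=True)
```

From C#, the same command could be launched through System.Diagnostics.Process.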
Method 5:
If your sole task is to listen to speech-synthesized text from a PDF, how about the Acrobat "Read out loud" function at the bottom of the "View" menu?
(by Attilah, Dirk Vollmar, dicroce, jao, billmcc, spender)