Posted 10.10.2004, 11:13
YuriyA

Still, to avoid any misunderstanding, here is the description in the original.

README for Macromedia Flash Search Engine SDK 1.0


Contents of this readme file

* Overview
* What swf2html extracts from a SWF file
* Sample output
* Command-line implementation
* Linked library implementation
* Build notes

Overview

The Macromedia Flash Search Engine SDK 1.0 provides search engines with the means to search and index Macromedia Flash (SWF) movies. The swf2html utility used by the SDK extracts text and links from a Macromedia Flash SWF file and writes them to stdout or to an HTML document. When a search engine deploys this SDK, users can locate relevant Flash content when searching by keyword or file type.

The SDK includes the following:

* swf2html executable files, for command-line implementation
* libswf2html static libraries, for linked library implementation
* source code
* a zlib decompression library for decompressing Flash Player 6 files that have been compressed (the zlib files are included in the swf2html.dsw project file)

The SDK supports SWF files created for Flash Player 3, 4, 5 and 6, and can be run on Windows or Linux systems. All code in the SDK is written in the C++ language.

What swf2html extracts from a SWF file

You can use swf2html to extract links, text, or both. By default, both links and text are extracted.
Links

By default, swf2html extracts links (URLs) from within ActionScript code that meet the following criteria:

* the link is contained within single quotes (' ') or double quotes (" "), and
* the link prefix is HTTP, and
* the link suffix is HTM, HTML, CFM, SWF, JPG, JPEG, GIF, MP3, or WAV

To specify different default prefixes and suffixes for links to be extracted, you can edit the swf2html.cpp file. You can also specify additional prefixes and suffixes to be extracted at run time if you use the Command-line implementation. Note that swf2html is not case-sensitive.
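
The default criteria above can be sketched as a small filter. This is only an illustration of a literal reading of the list (prefix and suffix both required, matching case-insensitively); the function names are hypothetical and are not taken from swf2html.cpp, and treating a suffix as a file extension after a dot is an assumption. Note that the sample output later in this README lists http://www.macromedia.com, which carries none of the default suffixes, so the actual rules in swf2html.cpp may be looser than this reading.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Lower-case an ASCII string so that comparisons ignore case.
static std::string toLowerAscii(std::string s) {
    for (std::size_t i = 0; i < s.size(); ++i)
        s[i] = static_cast<char>(std::tolower(static_cast<unsigned char>(s[i])));
    return s;
}

// True if `text` starts with `prefix`, ignoring ASCII case.
static bool startsWithNoCase(const std::string& text, const std::string& prefix) {
    return text.size() >= prefix.size() &&
           toLowerAscii(text.substr(0, prefix.size())) == toLowerAscii(prefix);
}

// True if `text` ends with "." + suffix, ignoring ASCII case.
// Assumption: a "suffix" is a file extension that follows a dot.
static bool endsWithExtensionNoCase(const std::string& text, const std::string& suffix) {
    assert(!suffix.empty());  // filter entries are expected to be non-empty
    const std::string ext = "." + toLowerAscii(suffix);
    return text.size() >= ext.size() &&
           toLowerAscii(text.substr(text.size() - ext.size())) == ext;
}

// Hypothetical helper: applies the prefix AND suffix criteria to a string
// that has already been taken out of its surrounding quotes.
bool looksLikeDefaultLink(const std::string& candidate,
                          const std::vector<std::string>& prefixes,
                          const std::vector<std::string>& suffixes) {
    bool prefixOk = false;
    for (const std::string& p : prefixes)
        if (startsWithNoCase(candidate, p)) { prefixOk = true; break; }
    if (!prefixOk) return false;

    for (const std::string& s : suffixes)
        if (endsWithExtensionNoCase(candidate, s)) return true;
    return false;
}
```

With prefixes {"http"} and suffixes {"htm", "html", "swf"}, the string HTTP://example.com/Page.HTML passes this filter, while ftp://example.com/page.html and http://example.com/notes.txt do not.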

Text

By default, swf2html extracts the following text:

* text on stage in the current movie (dynamic text, static text, or input text that has an initial value assigned)
* text on stage in a movie that is called with movieClip.attachMovie()

When a text field on the stage contains plain text, swf2html outputs it as plain text, with the following characters converted into HTML entities:
Text character    HTML entity
<                 &lt;
>                 &gt;
&                 &amp;
"                 &quot;
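
The character conversion in the table can be sketched in a few lines of C++. This is a minimal illustration, not the SDK's own routine; the function name escapeHtmlText is hypothetical.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: replaces the four characters from the table above
// with their HTML entities and copies every other character unchanged.
std::string escapeHtmlText(const std::string& text) {
    std::string out;
    out.reserve(text.size());
    for (char ch : text) {
        switch (ch) {
            case '<': out += "&lt;";   break;
            case '>': out += "&gt;";   break;
            case '&': out += "&amp;";  break;
            case '"': out += "&quot;"; break;
            default:  out += ch;       break;
        }
    }
    return out;
}
```

Because each input character is examined exactly once, an ampersand that is already part of the input is escaped, but the ampersands introduced by the replacement entities are not escaped again.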

When a text field on the stage contains HTML code, the initial text displayed in the text field is included as HTML in the output file. For example, the HTML text field "Text link to Macromedia (center)", displayed in purple, with a hyperlink to http://www.macromedia.com, might be extracted into the output file as follows:

<P ALIGN="CENTER">
<FONT FACE="Tahoma" SIZE="24" COLOR="#990099">
<A HREF="http://www.macromedia.com">Text link to Macromedia (center)</A>
</FONT>
</P>

Note that the prefix and suffix filters specified for links do not apply to text. That is, if a text field on the stage contains an HTML tag such as <A HREF="ftp://myFTPsite.com/mySubdir">, swf2html extracts this text even if FTP is not specified as a prefix type to be extracted.

Sample output

The following samples show the output swf2html created for the Flash Player 6 SWF file shown below.

Note: The character set of Flash Player 6 and later SWF files is Unicode, encoded using UTF-8. When you output data from a Flash Player 6 or later SWF file to an HTML file, the following META tag is added to the HTML output:

<meta http-equiv="content-type" content="text/html; charset=utf-8">

This META tag is not placed in the HTML output for Flash Player 5 and earlier SWF files. For those files, the character set is either ISO-8859-1 or Shift-JIS.
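
As a rough illustration of how a caller might tell these cases apart, the sketch below reads the 8-byte SWF header. The header layout (signature "FWS" or "CWS", a version byte, then a little-endian length) comes from the SWF file format rather than from this README, and the struct and function names are hypothetical; the SDK's own parsing code may be organized differently.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical summary of the fixed 8-byte SWF header.
struct SwfHeaderInfo {
    bool compressed;           // true for "CWS" (zlib-compressed body)
    unsigned version;          // SWF version byte, e.g. 6 for Flash Player 6
    std::uint32_t fileLength;  // uncompressed length, stored little-endian
};

// Fills `info` and returns true if `data` starts with a plausible SWF header.
bool parseSwfHeader(const unsigned char* data, std::size_t size, SwfHeaderInfo& info) {
    if (data == nullptr || size < 8) return false;
    if ((data[0] != 'F' && data[0] != 'C') || data[1] != 'W' || data[2] != 'S')
        return false;
    info.compressed = (data[0] == 'C');
    info.version = data[3];
    info.fileLength = static_cast<std::uint32_t>(data[4]) |
                      (static_cast<std::uint32_t>(data[5]) << 8) |
                      (static_cast<std::uint32_t>(data[6]) << 16) |
                      (static_cast<std::uint32_t>(data[7]) << 24);
    return true;
}

// Per the note above: version 6 and later output gets the UTF-8 META tag.
bool wantsUtf8Meta(const SwfHeaderInfo& info) {
    return info.version >= 6;
}
```

A compressed file is the case in which the bundled zlib library would be needed before the rest of the movie can be read.
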
Default output

<html>
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title></title>
</head>
<body>
<p>Links
</p>
<a href="http://www.macromedia.com">http://www.macromedia.com</a>
<p>Contact Us
</p>
<a href="http://www.macromedia.com">http://www.macromedia.com</a>
<p>Sweepstakes
</p>
<a href="http://www.macromedia.com">http://www.macromedia.com</a>
<p>The End of Compromise

The Stiletto answers questions no one dared to ask. Can you make a performance electric car? Can you make a luxury car affordable? Can you make a small car safe? Can I drive from LA to Vegas in an electric car?

Z.E.V. has an answer: Yes.</p>
</body>
</html>
Links-only output

<html>
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title></title>
</head>
<body>
<a href="http://www.macromedia.com">http://www.macromedia.com</a>
<a href="http://www.macromedia.com">http://www.macromedia.com</a>
<a href="http://www.macromedia.com">http://www.macromedia.com</a>
</body>
</html>
Text-only output

<html>
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title></title>
</head>
<body>
<p>Links
</p>
<p>Contact Us
</p>
<p>Sweepstakes
</p>
<p>The End of Compromise

The Stiletto answers questions no one dared to ask. Can you make a performance electric car? Can you make a luxury car affordable? Can you make a small car safe? Can I drive from LA to Vegas in an electric car?

Z.E.V. has an answer: Yes.</p>
</body>
</html>

Command-line implementation
Syntax

swf2html [-l] [-t] [-o outputFile] [-p prefixes] [-s suffixes] inputFile

-l extracts links only
-t extracts text only
-o specifies a filename for the output
-p specifies a prefixes filter string in the form "pre1|pre2"
-s specifies a suffixes filter string in the form "suf1|suf2"

Notes
Specifying both -l and -t has the same effect as specifying neither; that is, the default behavior of extracting both links and text will take place.
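
The rule for -l and -t can be written down as a tiny function. This is a sketch of the behavior described above, not code from the SDK; DumpFlags and resolveDumpFlags are hypothetical names.

```cpp
#include <cassert>

// Hypothetical pair of flags describing what should be extracted.
struct DumpFlags {
    bool dumpLinks;
    bool dumpText;
};

// Maps the -l / -t switches to extraction flags.
// Both switches together behave like neither: links and text are extracted.
DumpFlags resolveDumpFlags(bool linksOnly, bool textOnly) {
    DumpFlags flags = { true, true };
    if (linksOnly != textOnly) {
        // Exactly one switch was given, so only that kind of data is kept.
        flags.dumpLinks = linksOnly;
        flags.dumpText = textOnly;
    }
    return flags;
}
```

The resulting pair corresponds to the values one would pass to SetDumpLinks and SetDumpText in the library API described later in this README.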

Text on the stage is not parsed for links. That is, if a textbox on the stage contains the text "http://www.somesite.com", it is not extracted if you specify -l, and it is extracted if you specify -t.

If you omit outputFile, data is extracted to stdout. If you specify outputFile, include the .htm or .html suffix. (A suffix is not appended to outputFile automatically.)

If you specify prefixes or suffixes on the command line, these link types will be extracted in addition to (not instead of) the default types listed in swf2html.cpp.
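
The "in addition to, not instead of" behavior can be illustrated by splitting a "suf1|suf2" string and appending its entries to a default list. This is a sketch under assumptions: the helper names are hypothetical, and skipping empty entries and case-insensitive duplicates is a choice made here, not something the README specifies.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Lower-case an ASCII string so that comparisons ignore case.
static std::string toLowerAscii(std::string s) {
    for (std::size_t i = 0; i < s.size(); ++i)
        s[i] = static_cast<char>(std::tolower(static_cast<unsigned char>(s[i])));
    return s;
}

// Splits a filter string such as "txt|js" on '|', skipping empty entries.
std::vector<std::string> splitFilterString(const std::string& filters) {
    std::vector<std::string> parts;
    std::string::size_type start = 0;
    while (start <= filters.size()) {
        std::string::size_type bar = filters.find('|', start);
        if (bar == std::string::npos) bar = filters.size();
        if (bar > start) parts.push_back(filters.substr(start, bar - start));
        start = bar + 1;
    }
    return parts;
}

// Returns the defaults followed by any extra entries not already present.
std::vector<std::string> mergeFilters(std::vector<std::string> defaults,
                                      const std::string& extra) {
    for (const std::string& item : splitFilterString(extra)) {
        const std::string lowered = toLowerAscii(item);
        bool known = false;
        for (const std::string& existing : defaults)
            if (toLowerAscii(existing) == lowered) { known = true; break; }
        if (!known) defaults.push_back(item);
    }
    return defaults;
}
```

For example, merging the defaults {"htm", "html"} with the command-line string "txt|HTML|js" keeps both defaults and adds only txt and js.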

To use stdin as the input source, use - as inputFile. This usage assumes that you are calling swf2html from within a program, and that you have previously passed to stdin the SWF data you want to parse.

Sample implementations

* To extract text and default links from input.swf, outputting to stdout:

swf2html input.swf

* To extract text and default links from input.swf, outputting to output.html:

swf2html -o output.html input.swf

* To extract text, default links, and links to ftp sites, .txt files, and .js files from input.swf, outputting to output.html:

swf2html -o output.html -p "ftp" -s "txt|js" input.swf

* To extract only text (no links) from input.swf, outputting to output.html:

swf2html -t -o output.html input.swf

* To extract text and default links from stdin, outputting to stdout:

swf2html -

Linked library implementation

The swf2html command-line utility uses the libswf2html library to perform its SWF-to-HTML conversion. This library is provided, so you can link it into an application to provide SWF-to-HTML conversion.

* On Windows, the binary libswf2html.lib is provided, and you can use the project file libswf2html.dsp to build the library.
* On Linux, the binary libswf2html.a is provided, and the provided Makefile will build the library. The STL map and string classes are used, so you must also link libstdc++ into the target application.

libswf2html API

The libswf2html library exposes an API that can be used to perform SWF-to-HTML conversion. For portability, the following data types are defined in swf2html.h:

* swf_U8: unsigned 8-bit integer
* swf_U16: unsigned 16-bit integer
* swf_U32: unsigned 32-bit integer
* swf_S8: signed 8-bit integer
* swf_S16: signed 16-bit integer
* swf_S32: signed 32-bit integer

There are two C++ classes in this API, Swf2HtmlConverter and Swf2HtmlConverterStdio.
class Swf2HtmlConverter

The Swf2HtmlConverter class is the core class that performs SWF-to-HTML conversion. Because it is an abstract base class (its input and output methods are pure virtual), it must be subclassed to be used. To use the Swf2HtmlConverter class, use #include "swf2html.h" and override the following methods:

* Write a byte of data to the HTML output stream:

virtual void PutByte(swf_U8 ch) = 0;

* Write a null-terminated string to the HTML output stream:

virtual void PutString(const char *str) = 0;

* Read up to count bytes from the SWF input stream into buffer and return the number of bytes read:

virtual swf_S32 ReadInput(void *buffer, swf_S32 count) = 0;

* Display the specified string as an error message:

virtual void DisplayError(const char *str) = 0;

Once these routines are overridden, the ConvertSwf2Html method may be invoked to perform the SWF-to-HTML conversion.

The public functions offer read/write access to some flags and variables, as well as the entry point for starting the conversion process. See above for information about prefix and suffix filters and the distinction between links and text.

* Entry point to start reading the input data and to parse it into text and links, based on the overridden functions:

bool ConvertSwf2Html();

* Returns a boolean indicating whether the converter is set to output links:

bool GetDumpLinks() const;

* Sets the flag that indicates whether the converter should output links:

void SetDumpLinks(bool dumpLinks);

* Returns a boolean indicating whether the converter is set to output text:

bool GetDumpText() const;

* Sets the flag that indicates whether the converter should output text:

void SetDumpText(bool dumpText);

* Returns the string that contains the prefixes of links to be extracted:

const char *GetPrefixFilters() const;

* Sets the string that contains the prefixes of links to be extracted:

void SetPrefixFilters(const char *filters);

* Returns the string that contains the suffixes of links to be extracted:

const char *GetSuffixFilters() const;

* Sets the string that contains the suffixes of links to be extracted:

void SetSuffixFilters(const char *filters);

class Swf2HtmlConverter
{
public:
Swf2HtmlConverter();
~Swf2HtmlConverter();

bool ConvertSwf2Html();

bool GetDumpLinks() const;
void SetDumpLinks(bool dumpLinks);
bool GetDumpText() const;
void SetDumpText(bool dumpText);

const char *GetPrefixFilters() const;
void SetPrefixFilters(const char *filters);
const char *GetSuffixFilters() const;
void SetSuffixFilters(const char *filters);

protected:
virtual void PutByte(swf_U8 ch) = 0;
virtual void PutString(const char *str) = 0;
virtual swf_S32 ReadInput(void *buffer, swf_S32 count) = 0;
virtual void DisplayError(const char *str) = 0;

private:
... internal swf2html code ...
};
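
As an illustration of the subclassing pattern, the sketch below reads SWF data from memory and collects the HTML in a string. So that it compiles without the SDK, it declares a stand-in Swf2HtmlConverter whose ConvertSwf2Html only drains the input and writes a placeholder; with the real SDK you would remove the stand-in and use #include "swf2html.h" instead. MemoryConverter and its accessors are hypothetical names, the typedefs are guesses at what swf2html.h defines, and returning 0 from ReadInput at end of input is an assumption.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>
#include <vector>

// ---- Stand-in for "swf2html.h" (NOT the real SDK code) ----
typedef unsigned char swf_U8;
typedef int swf_S32;

class Swf2HtmlConverter {
public:
    virtual ~Swf2HtmlConverter() {}
    // Stand-in only: drains the input and emits a placeholder, no SWF parsing.
    bool ConvertSwf2Html() {
        char chunk[4];
        swf_S32 total = 0;
        swf_S32 got = 0;
        while ((got = ReadInput(chunk, static_cast<swf_S32>(sizeof chunk))) > 0)
            total += got;
        if (total == 0) { DisplayError("no input data"); return false; }
        PutString("<!-- placeholder output -->");
        PutByte('\n');
        return true;
    }
protected:
    virtual void PutByte(swf_U8 ch) = 0;
    virtual void PutString(const char* str) = 0;
    virtual swf_S32 ReadInput(void* buffer, swf_S32 count) = 0;
    virtual void DisplayError(const char* str) = 0;
};
// ---- End of stand-in ----

// Hypothetical subclass: input comes from a byte vector, output goes to strings.
class MemoryConverter : public Swf2HtmlConverter {
public:
    explicit MemoryConverter(const std::vector<swf_U8>& swfData)
        : input_(swfData), readPos_(0) {}
    const std::string& html() const { return html_; }
    const std::string& errors() const { return errors_; }
protected:
    virtual void PutByte(swf_U8 ch) { html_ += static_cast<char>(ch); }
    virtual void PutString(const char* str) { html_ += str; }
    virtual swf_S32 ReadInput(void* buffer, swf_S32 count) {
        assert(buffer != nullptr);
        if (count <= 0) return 0;
        const std::size_t remaining = input_.size() - readPos_;
        const std::size_t n = std::min(remaining, static_cast<std::size_t>(count));
        if (n > 0) std::memcpy(buffer, &input_[readPos_], n);
        readPos_ += n;
        return static_cast<swf_S32>(n);  // 0 signals end of input (assumption)
    }
    virtual void DisplayError(const char* str) { errors_ += str; errors_ += '\n'; }
private:
    std::vector<swf_U8> input_;
    std::size_t readPos_;
    std::string html_;
    std::string errors_;
};
```

A caller would construct MemoryConverter with the bytes of a SWF file, call ConvertSwf2Html(), and then read html() on success or errors() on failure. Keeping all I/O behind the four overridden methods is what lets the same conversion code work with files, sockets, or memory buffers.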

class Swf2HtmlConverterStdio

The Swf2HtmlConverterStdio class is a subclass of Swf2HtmlConverter. To use the Swf2HtmlConverterStdio class, use #include "swf2html_stdio.h"

The following example demonstrates how Swf2HtmlConverter can be subclassed, and also provides for two common SWF-to-HTML conversion scenarios:

* Read the SWF from a stdio stream and write the HTML to another stdio stream. true is returned on success, false is returned on failure.

bool ConvertSwf2Html(FILE *inputFile, FILE *outputFile);

* Read the SWF from a named input file and write the HTML to a named output file. true is returned on success, false is returned on failure.

bool ConvertSwf2Html(const char *inputFile, const char *outputFile);

class Swf2HtmlConverterStdio : public Swf2HtmlConverter
{
public:
Swf2HtmlConverterStdio();
~Swf2HtmlConverterStdio();

bool ConvertSwf2Html(FILE *inputFile,
FILE *outputFile);
bool ConvertSwf2Html(const char *inputFile,
const char *outputFile);

protected:
virtual void PutByte(swf_U8 ch);
virtual void PutString(const char *str);
virtual swf_S32 ReadInput(void *buffer, swf_S32 count);
virtual void DisplayError(const char *str);

private:
... internal swf2html code ...
};

Build notes

swf2html has been built on Windows 2000 using MSDev v. 6.0 and on RedHat Linux 7.0 using gcc version egcs 2.91.66 (egcs-1.1.2 release). The project file and Makefile are included.

swf2html is C++ source code; a C++ compiler is required to build it.

Platform-dependent defines and includes should be placed in the platformbuild.h file, which every cpp source file includes.

Defines and includes that are available to all platforms should be included in the globals.h file, which the header files include.

The version of zlib used is zlib-1.1.4.
