
History
-------

		Things not yet implemented in 3.2.x:
		
		* Approved sites.
		* CrossWords do not work as expected. Images are inserted
		  with wrong crosswords.

26 Sep 2003: 3.2.15
	* HTDBAddr command was added to index SQL tables from different
	  databases.
	* Range support for ftp was added, MP3 tags indexing is now
	  possible for ftp.
	* Phrase segmentation for search queries in Thai,
	  Chinese and Japanese languages was added.
	* HTDBLimit command was added to avoid huge memory usage for big tables.
	* Thai language phrases segmenter was added. 
	  Use LoadThaiList command to enable.
	* One can now increase and decrease the indexer log level using
	  SIGUSR1 and SIGUSR2 signals.
	* ResultsLimit command was added to allow reducing the maximum
	  number of results.
	* search.cgi now prints more HTML 4.01 compliant HREF values,
	  i.e. "&amp;" rather than "&".
	* GuesserUseMeta command was added.
	* SQLite support (http://www.sqlite.org/) was added.
	* Built-in support was removed, use SQLite instead.
	* hops calculation for multilingual documents was fixed.
	* Several bugs (#400, #402, #407, #409, #412, #435) were fixed.

29 Jul 2003: 3.2.14
	* Search can now order results by relevancy, popularity, date.
	  An option to choose results ordering was added into search.htm-dist.
	* Automatic language map updating was added.
	  Use "LangMapUpdate yes" command to enable.
	* MaxHops is now checked before adding new URLs into database.
	* "splitter" crash after indexing only a few documents was fixed.
	* Empty search results with multiple DBAddr were fixed.
	* NoMatch option for Realm and Server commands was fixed.
	* Memory leaks were fixed.
	* Memory corruption during relevancy calculation was fixed.
	* Normalization of words which appear in dictionaries for several
	  languages was fixed.
	* An unclosed file during cached check-up was fixed.

10 Jul 2003: 3.2.13
	* Check-up functionality for "stored" database was added.
	* "stored" connection locking for multi-threaded version was added.
	* A trap in search.cgi being executed without "stored" was fixed.
	* "indexer -Ecreate" and "indexer -Edrop" now work for
	  Oracle and MS SQL databases.
	* "indexer -q" was restored. A bug from 3.2.12.
	* A trap in multi-threaded indexer being executed in cache dbmode
	  without "cached" running was fixed.
	* "indexer -Esqlmon" now starts indexer in SQL monitor mode. One
	  can execute SQL queries against backends given in DBAddr
	  indexer.conf commands.
	* Optional readline support for "indexer -Esqlmon" was added.
	* configure failure with an explicitly specified expat path was fixed.
	* "Follow world" indexer.conf command was fixed.
	* ServerTable syntax was fixed in etc/indexer.conf-dist sample

25 Jun 2003: 3.2.12
	* HTTPS for systems without /dev/random or /dev/urandom was fixed. 
	* You can create and drop the database structure using
	  "indexer -Ecreate" and "indexer -Edrop" respectively.
	* Phrases detection was fixed.
	* Installation problem that appeared in some cases was fixed.

20 Jun 2003: 3.2.11
	* Buffer overflow exploit was fixed in search.cgi
	* The 256-byte limit on URL length was removed.
	  Please update the db structure when upgrading from the previous version.
	* Check-up functionality for cached database was added.
	* MeCab Japanese morphological analyzer support was added.
	  Use --enable-mecab option for configure to enable it.
	* Log2stderr command was added.
	* UdmStrCRC32 was replaced by UdmStrHash32 everywhere except crc32
	  itself. It is faster and produces fewer collisions.
	  Full reindexing is needed when upgrading.
	* PopRankUseTracking, PopRankUseShowCnt, PopRankShowCntRatio and
	  PopRankShowCntWeight commands were added.
	* Multi DBAddr support was added. LogdAddr, StoredAddr and TrackQuery
	  commands were removed. See new parameters of DBAddr command.
	* Charset guessing for the case when no language maps are
	  loaded was fixed.
	* Fixed that search.cgi was not able to open cache-mode files in some
	  cases: the creation mode for var/tree/url* files was corrected.
	* qtrack table was separated into qtrack and qinfo tables.

11 Apr 2003: 3.2.10
	* <base href=...> processing was fixed.
	* Bug #339: truncation of all words at 4 characters was fixed.
	* <!ELSEIF and <!ELIF processing in templates was fixed.

07 Apr 2003: 3.2.9
	* Synonyms list for French language was added.
	* Big synonyms list for Russian language was added.
	* VaryLang command was added for indexing multilingual servers.
	* -s switch for cached was added to specify sleep time at start-up.
	* <META NAME="robots" CONTENT="NOARCHIVE"> processing was added.
	* The server table was split into server and srvinfo.

30 Jan 2003: 3.2.8
	* Unigrams were removed from language and charset guessing. This
	  makes guessing faster and in some cases better.
	* Lithuanian stopword file was added. Thanks Arnoldas Lukaševičius.
	* mconv utility was added.
	* Georgian geostd8 charset support was added.
	* "DateFormat" template variable was added.
	* indexer now can use UDM_CONF_DIR environment variable.
	* MP3 parser doesn't convert into HTML anymore. New sections
	  MP3.Album, MP3.Song, MP3.Artist and MP3.Year were added.
	* "Server" and "Realm" commands can now take a new optional
	  argument to specify an action which will be applied for
	  documents matching this command. For example,
	  "Server HrefOnly http://localhost/" forces indexer to
	  download the given documents and collect new links from them,
	  but without indexing the documents' content.
	* "Follow" command was removed. Use "Server" or "Realm" instead.
	* text/xml indexing was added, needs Expat library to be installed.
	  Use --with-expat configure switch to activate.
	* search.htm now supports environment variables, e.g.
	  $(ENV.HTTP_REMOTE_ADDR)
	* New <!IF> ... <!ELSEIF> ... <!ELSE> ... <!ENDIF> syntax
	  is now supported in search.htm. 
	* New <!SET  NAME="dst_var_name" CONTENT="value"> search.htm command.
	* New <!COPY NAME="dst_var_name" CONTENT="src_var_name"> search.htm command.
	* Japanese euc-jp language map was added
	* Chinese sentence segmenter was added. You should enable GB2312
	  charset support and add LoadChineseList command to enable it.
	* ChaSen Japanese morphological analysis system support was added.
	  Use --enable-chasen option for configure to enable it.
	* "Limit" command syntax was simplified.
	* zlib support is now enabled in "configure" by default.
	* "PopRankSkipSameSite" command was added. It allows excluding links
	  from the same site from the Popularity Rank calculation.
	
11 Oct 2002: 3.2.7
	* Popularity rank was added.
	* New search CGI-parameters "sp" and "sy" to enable/disable
	  words forms and synonyms in search results respectively.
	* Chinese stoplist and language map were added.
	* Search limit by url.content_type was added.
	* Document score is now displayed as a percentage.
	* Now one can index specified tag attributes.
	* Search results can now be grouped by site.
	* Default MaxDocSize value is now 2 Mb.
	* Pages can be indexed in their hops order using "indexer -o".
	  Distinct criteria on site_id for PgSQL was added.
	* New "ParserTimeOut" indexer.conf command to avoid indexer hanging
	  with external parser.

19 Jun 2002: 3.2.6
	* If a document is fetched using a compressed (gzip/compress/deflate)
	  transfer encoding, the original (uncompressed) size is stored now 
	  into url.docsize.
	* search.cgi no longer fetches the whole document from "stored" to
	  display search word excerpts. Fetching stops once enough excerpts
	  have been built.
	* Fixed building the CVS version when jade/openjade is not
	  installed. In 3.2.5, "make install" failed on the attempt to
	  install the documentation.
	* Some bugs were fixed.

27 May 2002: 3.2.5
	* Separate DBMode command was removed. 
	* DBAddr command was extended to support this syntax: 
		DBAddr mysql://user:pass@host/dbname/?dbmode=multi
	* "ServerTable" indexer.conf command syntax was changed. Use this
	  style syntax: "ServerTable mysql://user:pass@host/dbname/tablename"
	  to load servers information from "tablename" SQL table. Note that
	  now you can load servers information using a database different
	  from the specified one in "DBAddr" command.
	* "DBAddr" argument format for Interbase was changed. Use 
	  "DBAddr ibase://hostname/path/to/mnogosearch.gdb/" with trailing 
	  slash after the *.gdb file name.
	* A trap on too long and escaped URLs was fixed.
	* Some memory leaks were fixed.
	* Fixed that incorrect SQL queries in "single" and "crc" DBModes 
	  were sent to server when the first word on a page is a stopword.
	  Thanks to luc at lvb.net

15 May 2002: 3.2.4
	* Renamed template variables responsible for displaying document
	  sections. Take a look into search.htm-dist.
	* Added a possibility to specify length for documents sections,
	  stored into database (body, title, etc).
	* Added OptimizeInterval and OptimizeRatio stored.conf commands.
	* <!--stored--> section is now processed like <!--clone--> section.
	* "stored" now supports "delete" and "optimize" operations. 
	* search.cgi used with "stored" can now return excerpts from document
	  around a place where search word is found.
	* Added new "StoredFiles" stored.conf command to limit a number of
	  archives used by "stored" daemon.
	* news:// and nntp:// retrieval system now supports authorization in
	  both AuthBasic indexer.conf command and in URL part, for example:
	  news://user:pass@servername/
	* Indexer's code is now more thread safe.
	* Added cache mode limits for searchd.
	* All cgi programs now use syslog (like indexer does).
	* Added documents mixing while indexing to avoid "rapid fire".
	* "Charset" indexer.conf command has been renamed to "RemoteCharset".
	* Added ISO-2022-JP charset support.
	* Added TSCII and MacGujarati charsets support.
	* Asian Big5, gb2312, EUC-KR and Shift-JIS charsets are
	  no longer compiled by default. This reduces the size of the
	  binaries. Use --with-extra-charsets to activate compilation of
	  these charsets.
	* Added NL, TL, BG, SV, DA, FR, ES, DE, HR language
	  maps built on the Bible.
	* Added Esperanto language maps. Thanks to Arto Sarle <arto@sarle.com>
	* Now one can load several files for the same lang + charset 
	  combination. It improves guesser results quality.
	* Some other improvements in language and charset guesser.
	* Removed command DeleteNoServer. Use "Follow world" instead.
	* Removed "SearchdAddr hostname:port" template command.
	  Use "DBAddr searchd://hostname:port/" instead.
	* Fixed that query words were not converted to LocalCharset before 
	  storing in "qtrack" table.
	* Fixed that the same word form could appear twice in $(W) variable.
	* $(DT) now displays URL if title is empty. Useful for text/plain
	  documents.
	* Fixed that indexer could loop robots.txt fetching in some cases.
	* Fixed some compilation problems on Mac OS X and IRIX.
	* Shared libraries are now built by default.

24 Nov 2001: 3.2.3
	* It is now possible to specify an alternative, non-default
	  path to the MySQL socket when connecting to localhost. For example:
	  DBAddr mysql://user:pass@hostname/database/?socket=/tmp/mysql.sock
	* Added 'src','width','height','size' and 'class' attributes 
	  processing in templates
	* Added wordinfo and searchwords highlighting when searchd used.
	  Attention: you need to clear the search cache, because its format
	  was changed.
	* Added search results cache support for searchd.
	* Added queries tracking at searchd.
	* Added indexer switch -b to block start of several indexer instances. 
	  Useful for example when indexers are started from crontab.
	* Added new template section <!--stored-->. search.cgi now checks that
	  the document is present in stored and fills this section only on
	  success.
	* Added new formatting in template variables. Now it's possible 
	  to limit variable displayed length. Use $(DU:40) to limit URLs to 
	  40 characters. This helps for example to avoid breaking tables 
	  structure in search results when URL is long enough.
	* Added new fields in query tracking system. Now it stores user's IP 
	  address. Don't forget to ALTER qtrack table according to new 
	  structure.
	* Added new MacCE, MacCroatian, MacGreek, MacRoman, MacTurkish, 
	  MacIceland, MacRomania, MacThai, MacArabic, MacHebrew character sets.
	* Added Catalan stopword list, thanks Jordi Gay Sensat <jgay@ajgirona.org>
	* Added Hungarian stopword list, thanks MURANYI Andras <muranyia@iqconsulting.hu>
	* Added Azerbaijani language maps for guesser.
	* Added Swedish stoplist. Thanks Johan Olde <johan.olde@phosworks.se>
	* Fixed that indexer could not connect to stored on remote machine.
	* Fixed that $(W) variable was not recoded to BrowserCharset when
	  no search results were found.
	* Fixed that MinWordLen and MaxWordLen didn't work in search.htm.
	* Fixed that Alias command was not working in search.htm.
	  Thanks Matthew Sullivan <matthew@netscape.com>
	* Fixed that robots.txt content was indexed as a usual text file
	  in some cases.
	* Fixed that <META NAME="Robots" CONTENT="NOINDEX,NOFOLLOW"> was
	  not working in some cases.
	* Fixed that variables were not substituted by their values in 
	  <!INCLUDE CONTENT="http://servername/include.cgi?q=$%(q)">
	* Fixed that links like http://xx/yy?a=b&#38;c=d were not
	  properly converted to http://xx/yy?a=b&c=d.
	* Fixed that mnogosearch could not connect to remote
	  Interbase server, as well as other minor Interbase bugs.
	  Thanks Henner.Kollmann@gmx.de.
	* Fixed that search.cgi crashed when the categories list is
	  requested but the table doesn't exist.
	* Fixed minor bug in robots.txt processing. Thanks Tim Pierce
	  <twp@unchi.org>.
	* Fixed compilation problems with ODBC libraries. Bug since 3.2.0.
	* Fixed EasySoft ODBC libraries linking problem which appeared because 
	  EasySoft changed names of their libraries. Now configure substitutes 
	  new libraries names to Makefile.

24 Oct 2001: 3.2.2
	* Added meta "Content-Language" processing, added "lang" attribute
	  processing for <html> and <body> tags.
	* Added IBM DB2 support. Tested with DB2 EE V7.1.
	* Stored and storedoc.cgi were added. It is now possible to store
	  and display a compressed copy of indexed documents with
	  search word highlighting.
	* Tag values are now passed using "tag" form variable so that 
	  the variable meaning is clearer. Old "g" form variable does not
	  work anymore.
	* Major documentation improvements and reorganization.
	* Fixed that category and language limits were not working.
	* Fixed that StopwordFile command didn't work in search.htm
	* Fixed that full/substring/beginning/ending word match didn't
	  work.
	* Fixed crash in ServerTable code.
	* Fixed crash in synonyms code on some platforms.
	* qtrack table field types were changed.
	* Fixed bug in MySQL single mode code. It could kill the mysqld
	  server when a document is big enough.
	* Fixed that iso-8859-1 entities like &eacute; were
	  not properly converted to unicode.
	* Fixed that HTML parser considered scripts body as
	  a text in some cases.
	* install.pl installation script has been added.
	* Some minor configure script and code clean-ups.

27 Sep 2001: 3.2.1
	* New "Listen" searchd.conf command. It allows binding
	  searchd to a specified host and/or port.
	* searchd can now reload searchd.conf when
	  it receives the HUP signal.
	* Improved signal safety in searchd.
	* Fixed that searchd.conf-dist was not included in the
	  distribution.
	* Fixed that national letters in the code range
	  128-255 were considered as word separators when
	  searchd is used. They also were not displayed in
	  search results (body, title, etc fields).
	* Fixed some bugs in HTML tag parsers that caused 
	  indexer to stall or crash in some cases.
	* Fixed that "Proxy" command was ignored.
	* Fixed that robots.txt related code could
	  stall or crash in threaded version.
	* Fixed a compilation problem with Oracle.
	* Fixed compilation problem with errno.h on Solaris.

24 Sep 2001: 3.2.0
	* Now one can compile with several SQL databases support at
	  the same time.
	* Now one can make a binary distribution using "make bin-dist".
	* Added new program searchd. Among other features, it allows 
	  to build a search cluster, distributing between several machines.
	* Support for synonyms fuzzy search has been added.
	* Common words endings fuzzy search using ispell now works in 
	  3.2 branch.
	* New "ReverseAlias" indexer.conf command. This command has
	  the same format as the "Alias" command. However, URL mapping
	  is executed immediately after a new link has been found.
	  The URL is stored into the database after ReverseAliases are
	  applied. Among other things, it allows for example indexing PHP
	  driven sites which add a unique session ID in the form
	  "PHPSESSION=344646342345df". ReverseAlias is able to remove such
	  substrings from URLs.
	* New "Subnet xxx.xxx.xxx.xxx" indexer.conf command. It works
	  like Realm but checks an IP address matching instead of URL. 
	  For example, "Subnet 195.239.38.*" or "Subnet NoMatch 192.*.*.*".
	* Search results highlighting (HlBeg and HlEnd search.htm commands)
	  now works in 3.2 branch.
	* CT-Lib support has been added. Now one can
	  use mnoGoSearch together with SyBase and MS SQL
	  natively, without ODBC drivers. Both original SyBase
	  CT-Lib and FreeTDS CT-Lib are supported. However, the CT-Lib
	  driver is still in beta.
	* indexer now works approximately twice as fast with Interbase.
	* Added support for deflate and compress Content-Encodings.
	* New VarDir command in search.htm. It works like the same
	  indexer.conf command but at search time.
	* New "Section" indexer.conf command. It is to be used instead of
	  old ***Weight commands, which have been removed. Take a look into
	  indexer.conf-dist and search.txt for an explanation.
	* Now it is possible to index user-defined META tags as well as
	  HTTP response headers.
	* New "Alias" command in search.htm. It works like "Alias"
	  in indexer.conf but at a search time.
	* Added support for external includes in search template.
	  Format differs from 3.1.x version. Take a look into
	  "templates.txt" for usage information.
	* "Alias" command has been extended. Now it can optionally use
	  powerful URL mapping using regular expressions like in "Realm"
	  command.
	* Posix threads should now work on platforms other than Linux and
	  FreeBSD. Thread detection for a number of platforms has been added.
	* libudmsearch compilation with pthreads was fixed. This fixes
	  crashes of Apache with the PHP mnoGoSearch extension module
	  when mnoGoSearch was compiled with pthreads support.
	* Tag parser has been rewritten. It now properly processes
	  tag attributes with '>' signs, like for example 
	  <META NAME=email Contents="<general@mnogosearch.org>">.
	  Earlier, '>' signs inside quotes were considered to be tag
	  endings.
	* Apple Darwin fixes for configure scripts
	* Extended number of query parameters stored in qtrack table
	* Added url.charset field. Charset is now stored separately from
	  content_type field. Please recreate or ALTER "url" table structure.
	* "Clones yes/no" has been renamed to "DetectClones yes/no"
	  to avoid confusions.


08 Aug 2001: 3.2.0.b2
	* Added Thai TIS-620 (aka iso-8859-11) charset support.
	* Content encoding support was added (currently gzip only).
	  Requires libz to compile. Use --with-zlib to activate.
	* Fixed that $(DE) was not working
	* Fixed that the correct charset was forgotten after 
	  robots.txt processing.
	* Fixed several bugs in new "cache mode".


03 Jul 2001: 3.2.0.b1
	* Charsets processing has been rewritten. Now mnoGoSearch supports 
	  almost all widely used charsets: various single-byte charsets as 
	  well as multi-byte charsets including UTF-8, Chinese (BIG5, GB2312), 
	  Korean (EUC-KR), Japanese (S-JIS). All internal processing works
	  using UNICODE representation. Using UTF8 as a LocalCharset one can 
	  build a multi-lingual search engine with languages which could not 
	  be indexed at the same time in 3.1.x branch, for example 
	  German+Greek+Russian+Chinese.
	* Character sets module has a new automatic language and charset 
	  detection. Currently more than 70 various charsets and languages
	  can be detected automatically when they are not specified in
	  "Content-type" and "Content-Language" server's response headers or 
	  html META tags.
	* News extensions are now compiled without --enable-news-extensions.
	  Use "NewsExtensions yes" indexer.conf command to activate them.
	* search.cgi has been rewritten.
	* Cache-mode has been rewritten.
	* Fixed template variables format. Now 
		$(x)  is plain variable value, 
		$&(x) is a HTML-escaped value
		$%(x) is a value escaped to be used in URLs
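
	  The three variable forms above can be illustrated with a small
	  Python sketch. This is not mnoGoSearch code: the render() helper
	  and the exact escaping rules (Python's html.escape and
	  urllib.parse.quote) are assumptions chosen only to show the
	  difference between plain, HTML-escaped and URL-escaped output.

```python
import html
import re
from urllib.parse import quote

# Matches $(name), $&(name) and $%(name).
_VAR = re.compile(r"\$([&%]?)\((\w+)\)")


def render(template, values):
    """Hypothetical helper: substitute template variables.

    Unknown variable names are replaced by an empty string.
    """
    def repl(match):
        mode, name = match.group(1), match.group(2)
        value = values.get(name, "")
        if mode == "&":
            return html.escape(value)      # $&(x): HTML-escaped value
        if mode == "%":
            return quote(value, safe="")   # $%(x): escaped for use in URLs
        return value                       # $(x): plain value

    return _VAR.sub(repl, template)


if __name__ == "__main__":
    query = {"q": "a&b c"}
    print(render("plain: $(q)", query))
    print(render("html:  $&(q)", query))
    print(render("url:   search.cgi?q=$%(q)", query))
```

	  In a template one would typically use $&(x) inside HTML text and
	  $%(x) inside HREF values.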
