

Hylton,

I can offer a few general considerations and thoughts. Your problem statement appears to contain two objectives:

a) To make the case of all the elements consistent: all upper case, all lower case, whichever you prefer. It seems the case makes no difference to the meaning of the record; it is just an artifact of the typing style of the data input person(s). In this situation, applying =UPPER() or =LOWER() to every cell and placing the result in a new sheet would suffice.

I think no amount of special formatting will catch your eye reliably enough to eliminate every instance of differing case by hand.
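For readers who would rather script this step than fill 136000 cells with formulas, here is a minimal sketch of the same idea in Python. It assumes the sheet has been exported to rows of text and that the two columns to clean are the first two; the function name `normalise_case` and the sample data are made up for illustration, and `str.lower()` stands in for =LOWER.

```python
def normalise_case(rows, columns=(0, 1)):
    """Return a copy of rows with the chosen column indexes lower-cased.

    The input is left untouched, mirroring the advice to write the
    result to a new sheet rather than overwrite the original.
    """
    out = []
    for row in rows:
        new_row = list(row)
        for i in columns:
            if i < len(new_row):
                new_row[i] = new_row[i].lower()
        out.append(new_row)
    return out


# Hypothetical sample modelled on the Col A / Col B example in the question.
sample = [["a", "hx"], ["A", "hx"], ["a", "HY"]]
normalised = normalise_case(sample)
```

After this step, rows that differed only by case become exactly equal, which is what the deduplication in (b) relies on.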

b) The deduplication. Once the case issue is resolved, you will have rows with all elements exactly equal. This deduplication should be done by exporting the 68000 x 10 sheet to a database and running a deduplication query. Again, any visual trick for identifying duplicate rows will be unreliable, and you are all but guaranteed to miss at least a few.
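As a rough illustration of what such a deduplication query can look like, the sketch below uses SQLite through Python's standard `sqlite3` module. This is not the Access query mentioned in this message; the table name `records`, the column names `col_a` and `col_b`, and the two-column layout are assumptions chosen to keep the example short (the real sheet has 10 columns). It selects the unique rows instead of deleting anything, so the source data stays intact.

```python
import sqlite3


def deduplicate(rows):
    """Return the unique (col_a, col_b) rows, in first-seen order."""
    conn = sqlite3.connect(":memory:")  # throwaway in-memory database
    conn.execute("CREATE TABLE records (col_a TEXT, col_b TEXT)")
    conn.executemany("INSERT INTO records VALUES (?, ?)", rows)
    # GROUP BY collapses exactly equal rows into one; ordering by the
    # smallest rowid in each group keeps the original row order.
    cur = conn.execute(
        "SELECT col_a, col_b FROM records "
        "GROUP BY col_a, col_b ORDER BY MIN(rowid)"
    )
    unique = cur.fetchall()
    conn.close()
    return unique


# Hypothetical rows, already case-normalised as in step (a).
rows = [("a", "hx"), ("a", "hx"), ("a", "hy"), ("a", "hx")]
unique_rows = deduplicate(rows)
```

Because the comparison is on exact equality, this only removes duplicates once the case has been made consistent first.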

At this point my advice bogs down. I have past experience with the M$ Access product; recent versions of Access have a pre-written deduplication query. It is not available to me right now because I have left my workplace and no longer have an installation of a recent version of the M$ suite.

Use caution in the deduplication process; make lots of backups. It is a delete-type query and data will be lost! Hopefully only the duplicates but you never know.

I hope this vague hand waving is of some help to you,

--
David S. Crampton

On Mon, 12 Dec 2011 06:39:31 -0800, Hylton Conacher (ZR1HPC) <hylton@conacher.co.za> wrote:

Hi,

Using LibreOffice 3.3.1 and am in the process of editing a 68000 row, 10
column file.

There are two main columns that contain the data to be cleaned up with
multiple  instances of duplication i.e. the same text but only the text
case differs between two rows or the text is totally different in column
A row 1 and row 2 but the text in column B rows 1 and 2 is identical i.e.

Col A   Col B   OR      Col A   Col B
a       hx              a       hx
A       hx              a       hx

OR

a       hx              a       hx
a       hy              A       hy

etc for the other combinations

I am doing the alphabetical sort via Col A.

I can use find to search for the duplicate record row once I know what I
am looking for; the difficulty is determining what text is different when
the values in Col A are the same, and vice versa.

On 136000 cells this is a FAIR mission!

I would like to know if there is a conditional formula I could use that
could highlight the differences in one column when cells in the other
column are the same. I am thinking of a formula that says if the cell
contents are the same as any other cell in a range, apply the
conditional format. Of course this conditional would need to be added
onto all 136000 cells. :(

That way I can highlight the 'error' cells and find them easily and
correct them or add a new row of data.

Any pointers would be appreciated for doing this in Calc as an external
database is not available. What elements of the formula can I investigate?

Many thanks
Hylton

