1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
493
494
495
496
497
498
499
500
501
502
503
504
505
506
507
508
509
510
511
512
513
514
515
516
517
518
519
520
521
522
523
524
525
526
527
528
529
530
531
532
533
534
535
536
537
538
539
540
541
542
543
544
545
546
547
548
549
550
551
552
553
554
555
556
557
558
559
560
561
562
563
564
565
566
567
568
569
570
571
572
573
574
575
576
577
578
579
580
581
582
583
584
585
586
587
588
589
590
591
592
593
594
595
596
597
598
599
600
601
602
603
604
605
606
607
608
609
610
611
612
613
614
615
616
617
618
619
620
621
622
623
624
625
626
627
628
629
630
631
632
633
634
635
636
637
638
639
640
641
642
643
644
645
646
647
648
649
650
651
652
653
654
655
656
657
658
659
660
661
662
663
664
665
666
667
668
669
670
671
672
673
674
675
676
677
678
679
680
681
682
683
684
685
686
687
688
689
690
691
692
693
694
695
696
697
698
699
700
701
702
703
704
705
706
707
708
709
710
711
712
713
714
715
716
717
718
719
720
721
722
723
724
725
726
727
728
729
730
731
732
733
734
735
736
737
738
739
740
741
742
743
744
745
746
747
748
749
750
751
752
753
754
755
756
757
758
759
760
761
762
763
764
765
766
767
768
769
770
771
772
773
774
775
776
777
778
779
780
781
782
783
784
785
786
787
788
789
790
791
792
793
794
795
796
797
798
799
800
801
802
803
804
805
806
807
808
809
810
811
812
813
814
815
816
817
818
819
820
821
822
823
824
825
826
827
828
829
830
831
832
833
834
835
836
837
838
839
840
841
842
843
844
845
846
847
848
849
850
851
852
853
854
855
856
857
858
859
860
861
862
863
864
865
866
867
868
869
870
871
|
/** \mainpage KMail architectural overview
\section KMail design principles
This file is intended to guide the reader's way through the KMail
codebase. It should esp. be handy for people not hacking full-time on
KMail as well as people that want to trace bugs in parts of KMail
which they don't know well.
Contents:
- Kernel
- Identity
- Filters
- ConfigureDialog
- MDNs
- Folders
- Index
- Headers
- Display
TODO: reader, composer, messages, accounts, ...
\section kernel KERNEL
Files: kmkernel.h, kmkernel.cpp
Contact Zack Rusin <[email protected]> with questions...
The first thing you'll notice about KMail is the extensive use of
kmkernel->xxx() constructs. The "kmkernel" is a define in kmkernel.h
declared as :
#define kmkernel KMKernel::self()
KMKernel is the central object in KMail. It's always created before
any other class, therefore you are _guaranteed_ that KMKernel::self()
(and therefore "kmkernel" construct) won't return 0 (null).
KMKernel implements the KMailIface (our DCOP interface) and gives
access to all the core KMail functionality.
\section identity IDENTITY
FIXME this has moved to libkpimidentities, right?
Files: identity*, kmidentity.{h,cpp}, configuredialog.cpp,
signatureconfigurator.{h,cpp}
Contact Marc Mutz <[email protected]> on questions...
Identities consists of various fields represented by
QStrings. Currently, those fields are hardcoded, but feel free to
implement KPIM::Identity as a map from strings to QVariants or somesuch.
One part of identities are signatures. They can represent four modes
(Signature::Type) of operation (disabled, text from file or command
and inline text), which correspond to the combo box in the
identitydialog.
Identities are designed to be used through the KPIM::IdentityManager:
const KPIM::Identity & ident =
kmkernel->identityManager()->identityForUoidOrDefault(...)
Make sure you assign to a _const_ reference, since the identityForFoo
methods are overloaded with non-const methods that access a different
list of identities in the manager that is used while configuring. That
is known source of errors when you use identityForFoo() as a parameter
to a method taking const KPIM::Identity &.
WARNING: Don't hold the reference longer than the current functions
scope or next return to the event loop. That's b/c the config dialog
is not modal and the user may hit apply/ok anytime between calls to
function that want to use the identity reference. Store the UOID
instead if you need to keep track of the identity. You may also want
to connect to one of the KPIM::IdentityManager::changed() or ::deleted()
Q_SIGNALS, if you want to do special processing in case the identity
changes.
Thus, in the ConfigureDialog, you will see non-const KPIM::Identity
references being used, while everywhere else (KMMessage,
IdentityCombo) const references are used.
The KPIM::IdentityCombo is what you see in the composer. It's a
self-updating combo box of KPIM::Identity's. Use this if you want the user
to choose an identity, e.g. in the folder dialog.
Ihe IdentityListView is what you see in the config dialog's identity
management page. It's not meant to be used elsewhere, but is DnD
enabled (well, at the time of this writing, only drag-enabled). This
is going to be used to dnd identities around between KNode and KMail,
e.g.
The SignatureConfigurator is the third tab in the identity
dialog. It's separate since it is used by the identity manager to
request a new file/command if the current value somehow fails.
\section filter FILTER
Contact Marc Mutz <[email protected]> on questions...
Filters consist of a search pattern and a list of actions plus a few
flags to indicate when they are to be applied (kmfilter.h).
They are managed in a QPtrList<KMFilter>, called KMFilterMgr. This
filter magnager is responsible for loading and storing filters
(read/writeConfig) and for executing them (process). The unique
instance of the filter manager is held by the kernel
(KMKernel::filterMgr()).
The search pattern is a QPtrList of search rules (kmsearchpattern.h) and a
boolean operator that defines their relation (and/or).
A search rule consists of a field-QString, a "function"-enum and a
"contents" or "value" QString. The first gives the header (or
pseudoheader) to match against, the second says how to match (equals,
consists, is less than,...) and the third holds the pattern to match
against.
Currently, there are two types of search rules, which are mixed
together into a single class: String-valued and int-valued. The latter
is a hack to enable \verbatim<size>\endverbatim and
\verbatim<age in days>\endverbatim pseudo-header matching.
KMSearchRules should better be organized like KMFilterActions are.
A filter action (kmfilteraction.h) inherits from KMFilterAction or one
of it's convenience sub-classes. They have three sub-interfaces: (1)
argument handling, (2) processing and (3) parameter widget handling.
Interface (1) consists of args{From,As}String(), name() and
isEmpty() and is used to read and write the arguments (if any) from/to
the config.
Interface (2) is used by the filter manager to execute the action
(process() / ReturnCode).
Interface (3) is used by the filter dialog to allow editing of
actions and consists of name(), label() and the
*ParamWidget*(). Complex parameter widgets are collected in
kmfawidget.{h,cpp}.
A typical call for applying filters is
KMKernel::filterMgr()
foreach message {
KMFilterMgr::process():
}
\section configuration CONFIGURE DIALOG
Files: configuredialog*.{h,cpp} ( identitylistview.{h,cpp} )
Contact Marc Mutz <[email protected]> on questions...
The configuredialog is made up of pages that in turn may consist of a
number of tabs. The genral rule of thumb is that each page and tab is
responsible for reading and writing the config options presented on
it, although in the future, that may be reduced to interacting with
the corresponding config manager instead. But that won't change the
basic principle.
Thus, there is an abstract base class ConfigurePage (defined in
configuredialog_p.h), which derives from QWidget. It has four methods
of which you have to reimplement at least the first two:
- void setup()
Re-read the config (from the config file or the manager) and update
the widgets correspondingly. That is, you should only create the
widgets in the ctor, not set the options yet. The reason for that is
that the config dialog, once created, is simply hidden and shown
subsequently, so we need a reset-like method anyway.
- void apply()
Read the config from the widgets and write it into the config file
or the corresponding config manager.
- void installProfile()
This is called when the user selected a profile and hit apply. A
profile is just another KConfig object. Therefore, this method
should be the same as setup(), except that you should only alter
widgets for configs that really exist in the profile.
For tabbed config pages, there exists a convenience class called
TabbedConfigurationPage, which (as of this writing only offers the
addTab() convenience method. It is planned to also provide
reimplementations of setup, dismiss, apply and installProfile that just
call the same functions for each tab.
\section mdn MDNs
Files: libkdenetwork/kmime_mdn.{h,cpp} and kmmessage.{h,cpp}, mostly
Contact Marc Mutz <[email protected]> on questions...
MDNs (Message Disposition Notifications; RFC 2298) are a way to send
back information regarding received messages back to their
sender. Examples include "message read/deleted/forwarded/processed".
The code in kmime_mdn.{h,cpp} is responsible for creating the
message/disposition-notification body part (2nd child of
multipart/report that makes the MDN) and for providing the template
for human-readable text that goes into the text/plain part (1st child
of the multipart/report).
The code in KMMessage::createMDN() actually constructs a message
containing a MDN for this message, using the kmime_mdn helper
functions. It starts by checking the index for an already sent MDN,
since the RFC demands that MDNs be sent only once for every
message. If that test succeeds, it goes on to check various other
constraints as per RFC and if all goes well the message containing the
multipart/report is created.
If you need to use this functionality, see KMReaderWin::touchMsg() and
KMFilterAction::sendMDN() for examples. The touchMsg() code is invoked
on display of a message and sends a "displayed" MDN back (if so
configured), whereas the KMFilterAction method is a convenience helper
for the various filter actions that can provoke a MDN (move to trash,
redirect, forward, ...).
\section folders Folders
Files: kmfolder*.{h,cpp}, folderstorage.{h,cpp} and *job.{h,cpp}
Contact Zack Rusin <[email protected]> with questions...
The collaboration among KMail folder classes looks
as follows :
KMFolderNode
/ \
/ \
KMFolderDir \
KMFolder
.
.
v
FolderStorage
|
|
KMFolderIndex
|
|
---< actual folder types: KMFolderImap, KMFolderMbox... >--
At the base KMail's folder design starts with KMFolderNode which
inherits QObject. KMFolderNode is the base class encapsulating
common folder properties such as the name and a boolean signifying whether
the folder is a folder holding mail directly or a KMFolderDir.
KMFolderNode's often do not have an on-disk representation, they are
entities existing only within KMail's design.
KMFolder acts as the runtime representation of a folder with the physical
storage part being represented by a member of type FolderStorage.
KMFolder and FolderStorage have many functions with the same names and
signatures, but there is no inheritance.
KMFolderIndex contains some common indexing functionality for physical folders.
Subclasses of KMFolderIndex finally interact directly with physical storage
or with storage providers over the network.
KMFolderDir is a directory abstraction which holds KMFolderNode's.
It inherits KMFolderNode and KMFolderNodeList which is a QPtrList<KMFolderNode>.
A special case of a KMFolderDir is KMFolderRootDir; it represents
the toplevel KMFolderDir in KMail's folder hierarchy.
KMFolderDir's contents are managed by KMFolderMgr's.
KMail contains three main KMFolderMgr's. They can be
accessed via KMKernel ( the "kmkernel" construct ). Those methods are :
1) KMFolderMgr *folderMgr() - which returns the folder manager for
the folders stored locally.
2) KMFolderMgr *imapFolderMgr() - which returns the folder manager
for all imap folders. They're handled a little differently because
for all imap messages only headers are cached locally while the
main contents of all messages is kept on the server.
3) KMFolderMgr *dimapFolderMgr() - which returns disconnected IMAP (dimap)
folder manager. In dimap, both the headers and a copy of the full message
are cached locally.
4) KMFolderMgr *searchFolderMgr() - which returns the folder manager
for search folders (folders created by using the "find
messages" tool). Other email clients call this type of folder
"virtual folders".
FolderJob classes - These classes allow asynchronous operations on
KMFolder's. You create a Job on the heap, connect to one of its
Q_SIGNALS and wait for the job to finish. Folders serve as FolderJob
factories. For example, to retrieve the full message from a folder
you do :
FolderJob *job = folderParent->createJob( aMsg, tGetMessage );
connect( job, SIGNAL(messageRetrieved(KMMessage*)),
SLOT(msgWasRetrieved(KMMessage*)) );
job->start();
\section index Index (old)
Files: kmfolderindex.{h,cpp} and kmmsg{base,info}.{h,cpp}
Contact Marc Mutz <[email protected]> or
Till Adam <[email protected]> or
Don Sanders <[email protected]>
with questions...
index := header *entry
header := magic LF NUL header-length byte-order-marker sizeof-long
magic := "# KMail-Index V" 1*DIGITS
header-length := Q_UINT32
byte-order-marker := Q_UINT32( 0x12345678 )
sizeof-long := Q_UINT32( 4 / 8 )
entry := tag length value
tag := Q_UINT32 ; little endian (native?)
length := Q_UINT16 ; little endian (native?)
value := tqunicode-string-value / ulong-value
tqunicode-string-value := 0*256QChar ; network-byte-order
ulong-value := unsigned_long ; little endian
Currently defined tag values are:
Msg*Part num. val type obtained by:
No 0 u -
From 1 u fromStrip().stripWhitespace()
Subject 2 u subject().stripWhitespace()
To 3 u toStrip().stripWhiteSpace()
ReplyToIdMD5 4 u replyToIdMD5().stripWhiteSpace()
IdMD5 5 u msgIdMD5().stripWhiteSpace()
XMark 6 u xmark().stripWhiteSpace()
Offset 7 l folderOffset() (not only mbox!)
LegacytqStatus 8 l mLegacytqStatus
Size 9 l msgSize()
Date 10 l date()
File 11 u fileName() (not only maildir!)
CryptoState 12 l (signatureState() << 16) | encryptionState())
MDNSent 13 l mdnSentState()
ReplyToAuxIdMD5 14 u replyToAuxIdMD5()
StrippedSubject 15 u strippedSubjectMD5().stripWhiteSpace()
tqStatus 16 l status()
u: tqunicode-string-value; l: ulong-value
Proposed new (KDE 3.2 / KMail 1.6) entry format:
index := header *entry
entry := sync 1*( tag type content ) crc-tag crc-value
sync := Q_UINT16 (32?) ; resync mark, some magic bit pattern
; (together with preceding crc-tag provides
; 24(40)bits to resync on in case of index
; corruption)
tag := Q_UINT8
type := Q_UINT8
content := variable-length-content / fixed-length-content
crc-tag := tag type ; tag=CRC16, type=CRC16
crc-value := Q_UINT16 ; the CRC16 sum is calculated over all of
; 1*( tag type content )
variable-length-content := length *512byte padding
padding := *3NUL ; make the string a multiple of 4 octets in length
fixed-length-content := 1*byte
length := Q_UINT16 (Q_UINT8?)
byte := Q_UINT8
The type field is pseudo-structured:
bit: 7 6 5 4 3 2 1 0
+-----+-----+-----+-----+-----+-----+-----+-----+
| uniq | chunk | len |
+-----+-----+-----+-----+-----+-----+-----+-----+
uniq: 3 bits = max. 8 different types with same chunk size:
for chunk = (0)00 (LSB(base)=0: octets):
00(0) Utf8String
01(0) BitField
10(0) reserved
11(0) Extend
for chunk = (1)00 (LSB(base)=1: 16-octet blocks):
00(1) MD5(Hash/List)
01(1) reserved
10(1) reserved
11(1) Extend
for chunk = 01 (shorts):
000 Utf16String
001-110 reserved
111 Extend
for chunk = 10 (int32):
000 Utf32String (4; not to be used yet)
001 Size32
010 Offset32
011 SerNum/UOID
100 DateTime
101 Color (QRgb: (a,r,g,b))
110 reserved
111 Extend
for chunk = 11 (int64):
000 reserved
001 Size64
010 Offset64
011-110 reserved
111 Extend
len: length in chunks
000 (variable width -> Q_UINT16 with the real width follows)
001..111: fixed-width data of length 2^len (2^1=2..2^6=128)
You find all defined values for the type field in indexentrybase.cpp
Currently defined tags are:
tag type content
DateSent DateTime Date:
DateReceived DateTime last Received:'s date-time
FromDisplayName String decoded display-name of first From: addr
ToDisplayName String dto. for To:
CcDisplayName String dto. for Cc:
FromAddrSpecs String possibly IMAA-encoded, comma-separated addr-spec's of From:
ToAddrSpecs String dto. for To:
CcAddrSpecs String dto. for Cc:
Subject String decoded Subject:
BaseSubjectMD5 String md5(utf8(stripOffPrefixes(subject())))
BodyPeek String body preview
MaildirFile String Filename of maildir file for this messagea
MBoxOffset Offset(64) Offset in mbox file (pointing to From_)
MBoxLength Size(64) Length of message in mbox file (incl. From_)
Size Size(64) rfc2822-size of message (in mbox: excl. From_)
tqStatus BitField (see below)
MessageIdMD5 MD5Hash MD5Hash of _normalized_ Message-Id:
MDNLink SerialNumber SerNum of MDN received for this message
DNSLink SerialNumber SerNUm of DSN received for this message
ThreadHeads SerialNumberList MD5Hash's of all (so far discovered)
_top-level thread parents_
ThreadParents SerialNumberList MD5Hash's of all (so far discovered)
thread parents
"String" is either Utf8String or (Utf16String or Latin1String),
depending on content
Currently allocated bits for the tqStatus BitField are:
Bit Value: on(/off) (\\imapflag)
# "std stati":
0 New (\\Recent)
1 Read (\\Seen)
2 Answered (\\Answered)
3 Deleted (\\Deleted)
4 Flagged (\\Flagged)
5 Draft (\\Draft)
6 Forwarded
7 reserved
# message properties:
8 HasAttachments
9 MDNSent ($MDNSent)
10..11 00: unspecified, 01: Low, 10: Normal, 11: High Priority
12..15 0001: Queued, 0010: Sent, 0011: DeliveryDelayed,
0100: Delivered, 0101: DeliveryFailed,
0110: DisplayedToReceiver, 0111: DeletedByReceiver,
1001: ProcessedByReceiver, 1010: ForwardedByReceiver,
1011-1110: reserved
1111: AnswerReceived
# signature / encryption state:
16..19 0001: FullyEncrypted, 0010: PartiallyEncrypted, 1xxx: Problem
20..23 0001: FullySigned, 0010: PartiallySigned, 1xxx: Problem
# "workflow stati":
24..25 01: Important, 10: ToDo, 11: Later (from Moz/Evo)
26..27 01: Work, 10: Personal, 11: reserved (dto.)
28..29 01: ThreadWatched, 10: ThreadIgnored, 11: reserved
30..31 reserved
All bits and combinations marked as reserved MUST NOT be altered if
set and MUST be set to zero (0) when creating the bitfield.
\section headers Headers (Threading and Sorting)
Contact Till Adam <[email protected]> or
Don Sanders <[email protected]>
with questions...
Threading and sorting is implemented in kmheaders.[cpp|h] and headeritem.[cpp|h]
and involves a handfull of players, namely:
class KMHeaders:
this is the listview that contains the subject, date etc of each mail.
It's a singleton, which means there is only one, per mainwidget headers
list. It is actually a member of KMMainwidget and accessible there.
class HeaderItem:
these are the [Q|K]ListViewItem descendend items the KMHeaders listview
consists of. There's one for each message.
class SortCacheItem:
these are what the threading and sorting as well as the caching of those
operate on. Each is paired with a HeaderItem, such that each holds a
pointer to one, and vice versa, each header item holds a pointer to it's
associated sort cache item.
.sorted file:
The order of the sorted and threaded (if threading is turned on for this
folder) messages is cached on disk in a file named .$FOLDER.index.sorted
so if, for example the name of a folder is foo, the associated sorting
cache file would be called ".foo.index.sorted".
For each message, its serial number, that of its parent, the length of its
sorting key, and the key itself are written to this file. At the start of
the file several per folder counts and flags are cached additionally,
immediately after a short file headers. The entries from the start of the
file are:
- "## KMail Sort V%04d\n\t" (magic header and version string)
- byteOrder flag containing 0x12345678 for byte order testing
- column, the sort column used for the folder
- ascending, the sort direction
- threaded, is the view threaded or is it not?
- appended, have there been items appended to the file (obsolete?)
- discovered_count, number of new mail discovered since the last sort file
generation
- sorted_count, number of sorted messages in the header list
What is used for figuring out threading?
- messages can have an In-Reply-To header that contains the message id of
another message. This is called a perfect parent.
- additionally there is the References header which, if present, holds a
list of message ids that the current message is a follow up to. We
currently use the second to last entry in that list only. See further
down for the reasoning behind that.
- If the above two fail and the message is prefixed (Re: foo, Fwd: foo etc.)
an attempt is made to find a parent by looking for messages with the same
subject. How that is done is explained below as well.
For all these comparisons of header contents, the md5 hashes of the headers
are used for speed reasons. They are stored in the index entry for each
message. All data structures described below use md5 hash strings unless
stated otherwise.
Strategy:
When a folder is opened, updateMessageList is called, which in turn calls
readSortOrder where all the fun happens. If there is a .sorted file at the
expected location, it is openend and parsed. The header flags are read in
order to figure out the state in which this .sorted file was written. This
means the sort direction, whether the folder is threaded or not etc.
FIXME: is there currently sanity checking going on?
Now the file is parsed and for each message in the file a SortCacheItem is
created, which holds the offset in the .sorted file for this message as well
as it's sort key as read from the file. That sort cache item is entered into
the global mSortCache structure (member of the KMHeaders instance), which is
a QMemArray<SortCacheItem *> of the size mFolder->count(). Note that this
is the size reported by the folder, not as read from the .sorted file. The
mSortCache (along with some other structures) is updated when messages are
added or removed. More on that further down.
As soon as the serial number is read from the file, that number is looked up
in the message dict, to ensure it is still in the current folder. If not, it
has been moved away in the meantime (possibly by outside forces such as
other clients) and a deleted counter is incremented and all further
processing stopped for this message.
The messages parent serial number, as read from the sorted file is then
used to look up the parent and reset it to -1 should it not be in the
current folder anymore. -1 and -2 are special values that can show up
as parent serial numbers and are used to encode the following:
-1 means this message has no perfect parent, a parent for it needs to
be found from among the other messages, if there is a suitable one
-2 means this message is top level and will forever stay so, no need
to even try to find a parent. This is also used for the non-threaded
case. These are messages that have neither an In-Reply-To header nor
a References header and have a subject that is not prefixed.
In case there is a perfect parent, the current sort cache item is
appended to the parents list of unsorted tqchildren, or to that of
root, if there is not. A sort cache item is created in the mSortCache
for the parent, if it is not already there. Messages with a parent of
-1 are appended to the "unparented" list, which is later traversed and
its elements threaded. Messages with -2 as the parent are tqchildren of
root as well, as noted above, and will remain so.
Once the end of the file is reached, we should have a nicely filled
mSortCache, containing a sort cache item for each message that was in the
sorted file. Messages with perfect parents know about them, top level
messages know about that as well, all others are on a list and will be
threaded later.
Now, what happens when messages have been added to the store since the last
update of the .sorted file? Right, they don't have a sort cache item yet,
and would be overlooked. Consequently all message ids in the folder from 0
to mFolder->count() are looked at and a SortCacheItem is created for the
ones that do not have one yet. This is where all sort cache items are created
if there was no sorted file. The items created here are by definition un-
sorted as well as unparented. On creation their sort key is figured out as
well.
The next step is finding parents for those messages that are either new, or
had a parent of -1 in the .sorted file. To that end, a dict of all sort
cache items indexed by the md5 hash of their messsage id headers is created,
that will be used for looking up sort cache items by message id. The list of
yet unparented messages is then traversed and findParent() called for each
element wihch checks In-Reply-To and References headers and looks up the
sort cache item of those parents in the above mentioned dict. Should none be
found, the item is added to a second list the items of which will be subject
threaded.
How does findParent() work, well, it tries to find the message pointed to by
the In-Reply-To header and if that fails looks for the one pointed to by the
second to last entry of the References header list. Why the second to last?
Imagine the following situation in Bob's kmail:
- Bob get's mail from Alice
- Bob replies to Alice's mail, but his mail is stored in the Outbox, not the
Inbox.
- Alice replies again, so Bob now has two mails from Alice which are part of
the same thread. The mail in the middle is somewhere else. We want that to
thread as follows:
Bob <- In-Reply-To: Alice1
============
Alice1
|_Alice <- In-Reply-To: Bob (not here), References: Alice, Bob
- since the above is a common special case, it is worth trying. I think. ;)
If the parent found is perfect (In-Reply-To), the sort cache items is marked
as such. mIsImperfectlyThreaded is used for that purposer, we will soon see
why that flag is needed.
On this first pass, no subject threading is attempted yet. Once it is done,
the messages that are now top-level, the current thread heads, so to speak,
are collected into a second dict ( QDict< QPtrList< SortCacheItem > > )
that contains for each different subject an entry holding a list of (so far
top level) messages with that subject, that are potential parents for
threading by subjects. These lists are sorted by date, so the parent closest
by date can be chosen. Sorting of these lists happens via insertion sort
while they are built because not only are they expected to be short (apart
from hard corner cases such as cvs commit lists, for which subject threading
makes little sense anyhow and where it should be turned off), but also since
the messages should be roughly pre sorted by date in the store already.
Some cursory benchmarking supports that assumption.
If now a parent is needed for a message with a certain subject, the list of
top level messages with that subject is traversed and the first one that is
older than our message is chosen as it's parent. Parents more than six weeks
older than the message are not accepted. The reasoning being that if a new
message with the same subject turns up after such a long time, the chances
that it is still part of the same thread are slim. The value of six weeks
is chosen as a result of a poll conducted on #kde-devel, so it's probably
bogus. :) All of this happens in the aptly named: findParentBySubject().
Everthing that still has no parent after this ends up at toplevel, no further
attemp is made at finding one. If you are reading this because you want to
implement threading by body search or somesuch, please go away, I don't like
you. :)
Ok, so far we have only operated on sort cache items, nothing ui wise has
happened yet. Now that we have established the parent/child relations of all
messages, it's time to create HeaderItems for them for use in the header
list. But wait, you say, what about sorting? Wouldn't it make sense to do
that first? Exactly, you're a clever bugger, ey? Here, have a cookie. ;)
Both creation of header items and sorting of the as of yet unsorted sort
cache items happen at the same time.
As previously mentioned (or not) each sort cache item holds a list of its
sorted and one of its unsorted tqchildren. Starting with the root node the
unsorted list is first qsorted, and then merged with the list of already
sorted tqchildren. To achieve that, the heads of both lists are compared and
the one with the "better" key is added to the list view next by creating a
KMHeaderListItem for it. That header item receives both its sort key as well
as its id from the sort cache item. Should the current sort cache item have
tqchildren, it is added to the end of a queue of nodes to repeat these steps
on after the current level is sorted. This way, a breadth first merge sort
is performed on the sort cache items and header items are created at each
node.
What follows is another iteration over all message ids in the folder, to
make sure that there are HeaderItems for each now. Should that not be the
case, top level emergency items are created. This loop is also used to add
all header items that are imperfectly threaded to a list so they can be
reevalutated when a new message arrives. Here the reverse mapping from
header items to sort cache items are made as well. Those are also necessary
so the sort cache item based data structures can be updated when a message
is removed.
The rest of readSortOrder should make sense on itself, I guess, if not, drop
me an email.
What happens when a message arrives in the folder?
Among other things, the msgAdded slot is called, which creates the necessary
sort cache item and header item for the new message and makes sure the data
structures described above are updated accordingly. If threading is enabled,
the new message is threaded using the same findParent and findParentBySubject
methods used on folder open. If the message ends up in a watched or ignored
thread, those status bits are inherited from the parent. The message is also
added to the dict of header items, the index of messages by message id and,
if applicable and if the message is threaded at top level, to the list of
potential parents for subject threading.
After those house keeping tasks are performed, the list of as of yet imper-
fectly threaded messages is traversed and our newly arrived message is
considered as a new parent for each item on it. This is especially important
to ensure that parents arriving out of order after their tqchildren still end
up as parents. If necessary, the entries in the .sorted file of rethreaded
messages are updated. An entry for the new message itself is appended to the
.sorted file as well.
Note that as an optimization newly arriving mail as part of a mailcheck in
an imap folder is not added via msgAdded, but rather via complete reload of
the folder via readSortOrder(). That's because only the headers are gotten
for each mail on imap and that is so fast, that adding them individually to
the list view is very slow by comparison. Thus we need to make sure that
writeSortOrder() is called whenever something related to sorting changes,
otherwise we read stale info from the .sorted file. The reload is triggered
by the folderComplete() signal of imap folders.
What happens when a message is removed from the folder?
In this case the msgRemoved slot kicks in and updates the headers list. First
the sort cache item and header item representing our message are removed from
the data structures and the ids of all items after it in the store decre-
mented. Then a list of tqchildren of the message is assembled containing those
tqchildren that have to be reparented now that our message has gone away. If
one of those tqchildren has been marked as toBeDeleted, it is simply added to
root at top level, because there is no need to find a parent for it if it is
to go away as well. This is an optimization to avoid rethreading all
messages in a thread when deleting several messages in a thread, or even the
whole thread. The KMMoveCommand marks all messages that will be moved out of
the folder as such so that can be detected here and the useless reparenting
can be avoided. Note that that does not work when moving messages via filter
action.
That list of tqchildren is then traversed and a new parent found for each one
using, again, findParent and findParentBySubject. When a message becomes
imperfectly threaded in the process, it is added to the corresponding list.
The message itself is removed from the list of imperfectly threaded messages
and its header item is finally deleted. The HeaderItem destructor destroys
the SortCacheItem as well, which is hopefully no longer referenced anywhere
at this point.
\section display DISPLAY (reader window - new)
Contact Marc Mutz <[email protected]> with questions...
What happens when a message is to displayed in the reader window?
First, since KMMessagePart and KMMessage don't work with nested body
parts, a hierarchical tree of MIME body parts is constructed with
partNode's (being wrappers around KMMessagePart and
DwBodyPart). This is done in KMReaderWin::parseMsg(KMMessage*).
After some legacy handling, an ObjectTreeParser is instantiated and
it's parseObjectTree() method called on the root
partNode. ObjectTreeParser is the result of an ongoing refactoring
to enhance the design of the message display code. It's an
implementation of the Method Object pattern, used to break down a
huuuge method into many smaller ones without the need to pass around
a whole lot of paramters (those are made members
instead). parseObjectTree() recurses into the partNode tree and
generates an HTML representation of each node in the tree (although
some can be hidden by setting them to processed beforehand or - the
new way - by using AttachmentStrategy::Hidden). The HTML generation
is delegated to BodyPartFormatters, while the HTML is written to
HTMLWriters, which abstract away HTML sinks. One of those is
KHTMLPartHTMLWriter, which writes to the KHTMLPart in the
readerwindow. This is the current state of the code. The goal of the
ongoing refactoring is to make the HTML generation blissfully
ignorant of the readerwindow and to allow display plugins that can
operate against stable interfaces even though the internal KMail
classes change dramatically.
To this end, we designed a new set of interfaces that allows plugins
to be written that need or want to asynchronously update their HTML
representation. This set of interfaces consists of the following:
- @em BodyPartFormatterPlugin
the plugin base class. Allows the BodyPartFormatterFactory to
query it for an arbitray number of BodyPartFormatters and
their associated metadata and url handlers.
- @em BodyPartFormatter
the formatter interface. Contains a single method format()
that takes a BodyPart to process and an HTMLWriter to write the
generated HTML to and returns a result code with which it can
request more information or delegate the formatting back to
some default processor. BodyPartFormatters are meant to be
Flyweights, implying that the format() method is not allowed
to maintain state between calls, but see Mememto below.
- @em BodyPart
body part interface. Contains methods to retrieve the content
of a message part and some of the more interesting header
information. This interface decouples the bodypart formatters
from the implementation used in KMail (or KNode, for that
matter). Also contains methods to set and retrieve a Memento,
which is used by BodyPartFormatters to maintain state between
calls of format().
- @em BodyPartMemento
interface for opaque state storage and event handling.
Contains only a virtual destructor to enable the backend
processing code to destroy the object without the help of the
bodypart formatter.
During the design phase we identified a need for BodyPartFormatters to
request their being called on some form of events, e.g. a dcop
signal. Thus, the Memento interface also includes the IObserver and
ISubject interfaces. If a BodyPartFormatter needs to react to a signal
(Qt or DCOP), it implements the Memento interface using a QObject,
connects the signal to a slot on the Memento and (as an ISubject)
notifies it's IObservers when the slot is called. If a Memento is
created, the reader window registers itself as an observer of the
Memento and will magically invoke the corresponding BodyPartFormatter,
passing along the Memento to be retrieved from the BodyPart interface.
An example should make this clearer. Suppose, we want to update our
display after 10 seconds. Initially, we just write out an icon, and
after 10 seconds, we want to replace the icon by a "Hello world!"
line. The following code does exactly this:
@code
class DelayedHelloWorldBodyPartFormatter
: public KMail::BodyPartFormatter {
public:
Result format( KMail::BodyPart * bodyPart,
KMail::HTMLWriter * htmlWriter ) {
if ( !bodyPart->memento() ) {
bodyPart->registerMemento( new DelayedHelloWorldBodyPartMemento() );
return AsIcon;
} else {
htmlWriter->write( "Hello, world!" );
return Ok;
}
}
};
class DelayedHelloWorldBodyPartMemento
: public QObject, public KMail::BodyPartMemento {
public:
DelayedHelloWorldBodyPartMemento()
: QObject( 0, "DelayedHelloWorldBodyPartMemento" ),
KMail::BodyPartMemento()
{
QTimer::singleShot( 10*1000, this, SLOT(slotTimeout()) );
}
private Q_SLOTS:
void slotTimeout() { notify(): }
private:
// need to reimplement this abstract method...
bool update() { return true; }
};
@endcode
*/
|