+++ /dev/null
-Date: Sun, 14 Sep 1997 20:17:06 -0700 (PDT)
-From: Josh MacDonald <jmacd@CS.Berkeley.EDU>
-To: gnome@athena.nuclecu.unam.mx, gtk-list@redhat.com
-Subject: [gtk-list] gtktext widget internal documentation
-
-
-Pete convinced me to just write up the text widget and let someone else
-finish it. I'm pretty busy and have other commitments now. Sorry. I think
-I'm not the most qualified for some of the remaining work anyway, because I
-don't really know Gtk and it's event model very well. Most of the work so
-far was possible without knowing Gtk all that well, it was simply a data
-structure exercise (though after reading this you might say it was a fairly
-complicated data structure exercise). I'm happy to answer questions.
-
--josh
-
-
-High level description:
-
-There are several layers of data structure to the widget. They are
-separated from each other as much as possible. The first is a gapped
-text segment similar to the data structure Emacs uses for representing
-text. Then there is a property list, which stores text properties for
-various ranges of text. There is no direct relation between the text
-property list and the gapped text segment. Finally there is a drawn
-line parameter cache to speed calculations when drawing and redrawing
-lines on screen. In addition to these data structures, there are
-structures to help iterate over text in the buffer.
-
-The gapped text segment is quite simple. It's parameters are (all
-parameters I mention here are in the structure GtkText):
-
- guchar* text;
- guint text_len;
- guint gap_position;
- guint gap_size;
- guint text_end;
-
-TEXT is the buffer, TEXT_LEN is its allocated length. TEXT_END is the
-length of the text, including the gap. GAP_POSITION is the start of
-the gap, and GAP_SIZE is the gap's length. Therefore, TEXT_END -
-GAP_SIZE is the length of the text in the buffer. The macro
-TEXT_LENGTH returns this value. To get the value of a character in
-the buffer, use the macro TEXT_INDEX(TEXT,INDEX). This macro tests
-whether the index is less than the GAP_POSITION and returns
-TEXT[INDEX] or returns TEXT[GAP_SIZE+INDEX]. The function
-MOVE_GAP_TO_POINT positions the gap to a particular index. The
-function MAKE_FORWARD_SPACE lengthens the gap to provide room for a
-certain number of characters.
-
-The property list is a doubly linked list (GList) of text property
-data for each contiguous set of characters with similar properties.
-The data field of the GList points to a TextProperty structure, which
-contains:
-
- TextFont* font;
- GdkColor* back_color;
- GdkColor* fore_color;
- guint length;
-
-Currently, only font and color data are contained in the property
-list, but it can be extended by modifying the INSERT_TEXT_PROPERTY,
-TEXT_PROPERTIES_EQUAL, and a few other procedures. The text property
-structure does not contain an absolute offset, only a length. As a
-result, inserting a character into the buffer simply requires moving
-the gap to the correct position, making room in the buffer, and either
-inserting a new property or extending the old one. This logic is done
-by INSERT_TEXT_PROPERTY. A similar procedure exists to delete from
-the text property list, DELETE_TEXT_PROPERTY. Since the property
-structure doesn't contain an offset, insertion into the list is an
-O(1) operation. All such operations act on the insertion point, which
-is the POINT field of the GtkText structure.
-
-The GtkPropertyMark structure is used for keeping track of the mapping
-between absolute buffer offsets and positions in the property list.
-These will be referred to as property marks. Generally, there are
-four property marks the system keeps track of. Two are trivial, the
-beginning and the end of the buffer are easy to find. The other two
-are the insertion point (POINT) and the cursor point (CURSOR_MARK).
-All operations on the text buffer are done using a property mark as a
-sort of cursor to keep track of the alignment of the property list and
-the absolute buffer offset. The GtkPropertyMark structure contains:
-
- GList* property;
- guint offset;
- guint index;
-
-PROPERTY is a pointer at the current property list element. INDEX is
-the absolute buffer index, and OFFSET is the offset of INDEX from the
-beginning of PROPERTY. It is essential to keep property marks valid,
-or else you will have the wrong text properties at each property mark
-transition. An important point is that all property marks are invalid
-after a buffer modification unless care is taken to keep them
-accurate. That is the difficulty of the insert and delete operations,
-because as the next section describes, line data is cached and by
-necessity contains text property marks. The functions for operating
-and computing property marks are:
-
- void advance_mark (GtkPropertyMark* mark);
- void decrement_mark (GtkPropertyMark* mark);
- void advance_mark_n (GtkPropertyMark* mark, gint n);
- void decrement_mark_n (GtkPropertyMark* mark, gint n);
- void move_mark_n (GtkPropertyMark* mark, gint n);
-
- GtkPropertyMark find_mark (GtkText* text, guint mark_position);
- GtkPropertyMark find_mark_near (GtkText* text, guint mark_position,
- const GtkPropertyMark* near);
-
-ADVANCE_MARK and DECREMENT_MARK modify the mark by plus or minus one
-buffer index. ADVANCE_MARK_N and DECREMENT_MARK_N modify the mark by
-plus or minus N indices. MOVE_MARK_N accepts a positive or negative
-argument. FIND_MARK returns a mark at MARK_POSITION using a linear
-search from the nearest known property mark (the beginning, the end,
-the point, etc). FIND_MARK_NEAR also does a linear search, but
-searches from the NEAR argument. A number of macros exist at the top
-of the file for doing things like getting the current text property,
-or some component of the current property. See the MARK_* macros.
-
-Next there is a LineParams structure which contains all the
-information necessary to draw one line of text on screen. When I say
-"line" here, I do not mean one line of text separated by newlines,
-rather I mean one row of text on screen. It is a matter of policy how
-visible lines are chosen and there are currently two policies,
-line-wrap and no-line-wrap. I suspect it would not be difficult to
-implement new policies for doing such things as justification. The
-LineParams structure includes the following fields:
-
- guint font_ascent;
- guint font_descent;
- guint pixel_width;
- guint displayable_chars;
- guint wraps : 1;
-
- PrevTabCont tab_cont;
- PrevTabCont tab_cont_next;
-
- GtkPropertyMark start;
- GtkPropertyMark end;
-
-FONT_ASCENT and FONT_DESCENT are the maximum ascent and descent of any
-character in the line. PIXEL_WIDTH is the number of pixels wide the
-drawn region is, though I don't think it's actually being used
-currently. You may wish to remove this field, eventually, though I
-suspect it will come in handy implementing horizontal scrolling.
-DISPLAYABLE_CHARS is the number of characters in the line actually
-drawn. This may be less than the number of characters in the line
-when line wrapping is off (see below). The bitflag WRAPS tells
-whether the next line is a continuation of this line. START and END
-are the marks at the beginning and end of the line. Note that END is
-the actual last character, not one past it, so the smallest line
-(containing, for example, one newline) has START == END. TAB_CONT and
-TAB_CONT_NEXT are for computation of tab positions. I will discuss
-them later.
-
-A point about the end of the buffer. You may be tempted to consider
-working with the buffer as an array of length TEXT_LENGTH(TEXT), but
-you have to be careful that the editor allows you to position your
-cursor at the last index of the buffer, one past the last character.
-The macro LAST_INDEX(TEXT, MARK) returns true if MARK is positioned at
-this index. If you see or add a special case in the code for this
-end-of-buffer case, make sure to use LAST_INDEX if you can. Very
-often, the last index is treated as a newline.
-
-[ One way the last index is special is that, although it is always
- part of some property, it will never be part of a property of
- length 1 unless there are no other characters in the text. That
- is, its properties are always that of the preceding character,
- if any.
-
- There is a fair bit of special case code to maintain this condition -
- which is needed so that user has control over the properties of
- characters inserted at the last position. OWT 2/9/98 ]
-
-Tab stops are variable width. A list of tab stops is contained in the
-GtkText structure:
-
- GList *tab_stops;
- gint default_tab_width;
-
-The elements of tab_stops are integers casted to gpointer. This is a
-little bogus, but works. For example:
-
- text->default_tab_width = 4;
- text->tab_stops = NULL;
- text->tab_stops = g_list_prepend (text->tab_stops, (void*)8);
- text->tab_stops = g_list_prepend (text->tab_stops, (void*)8);
-
-is how these fields are initialized, currently. This means that the
-first two tabs occur at 8 and 16, and every 4 characters thereafter.
-Tab stops are used in the computation of line geometry (to fill in a
-LineParams structure), and the width of the space character in the
-current font is used. The PrevTabCont structure, of which two are
-stored per line, is used to compute the geometry of lines which may
-have wrapped and carried part of a tab with them:
-
- guint pixel_offset;
- TabStopMark tab_start;
-
-PIXEL_OFFSET is the number of pixels at which the line should start,
-and tab_start is a tab stop mark, which is similar to a property mark,
-only it keeps track of the mapping between line position (column) and
-the next tab stop. A TabStopMark contains:
-
- GList* tab_stops;
- gint to_next_tab;
-
-TAB_STOPS is a pointer into the TAB_STOPS field of the GtkText
-structure. TO_NEXT_TAB is the number of characters before the next
-tab. The functions ADVANCE_TAB_MARK and ADVANCE_TAB_MARK_N advance
-these marks. The LineParams structure contains two PrevTabCont
-structures, which each contain a tab stop. The first (TAB_CONT) is
-for computing the beginning pixel offset, as mentioned above. The
-second (TAB_CONT_NEXT) is used to initialize the TAB_CONT field of the
-next line if it wraps.
-
-Since computing the parameters of a line are fairly complicated, I
-have one interface that should be all you ever need to figure out
-something about a line. The function FIND_LINE_PARAMS computes the
-parameters of a single line. The function LINE_PARAMS_ITERATE is used
-for computing the properties of some number (> 0) of sequential lines.
-
-void
-line_params_iterate (GtkText* text,
- const GtkPropertyMark* mark0,
- const PrevTabCont* tab_mark0,
- gboolean alloc,
- gpointer data,
- LineIteratorFunction iter);
-
-where LineIteratorFunction is:
-
-typedef gint (*LineIteratorFunction) (GtkText* text,
- LineParams* lp,
- gpointer data);
-
-The arguments are a text widget (TEXT), the property mark at the
-beginning of the first line (MARK0), the tab stop mark at the
-beginning of that line (TAB_MARK0), whether to heap-allocate the
-LineParams structure (ALLOC), some client data (DATA), and a function
-to call with the parameters of each line. TAB_MARK0 may be NULL, but
-if so MARK0 MUST BE A REAL LINE START (not a continued line start; it
-is preceded by a newline). If TAB_MARK0 is not NULL, MARK0 may be any
-line start (continued or not). See the code for examples. The
-function ITER is called with each LineParams computed. If ALLOC was
-true, LINE_PARAMS_ITERATE heap-allocates the LineParams and does not
-free them. Otherwise, no storage is permanently allocated. ITER
-should return TRUE when it wishes to continue no longer.
-
-There are currently two uses of LINE_PARAMS_ITERATE:
-
-* Compute the total buffer height for setting the parameters of the
- scroll bars. This is done in SET_VERTICAL_SCROLL each time the
- window is resized. When horizontal scrolling is added, depending on
- the policy chosen, the max line width can be computed here as well.
-
-* Computing geometry of some pixel height worth of lines. This is
- done in FETCH_LINES, FETCH_LINES_BACKWARD, FETCH_LINES_FORWARD, etc.
-
-The GtkText structure contains a cache of the LineParams data for all
-visible lines:
-
- GList *current_line;
- GList *line_start_cache;
-
- guint first_line_start_index;
- guint first_cut_pixels;
- guint first_onscreen_hor_pixel;
- guint first_onscreen_ver_pixel;
-
-LINE_START_CACHE is a doubly linked list of LineParams. CURRENT_LINE
-is a transient piece of data which is set in various places such as
-the mouse click code. Generally, it is the line on which the cursor
-property mark CURSOR_MARK is on. LINE_START_CACHE points to the first
-visible line and may contain PREV pointers if the cached data of
-offscreen lines is kept around. I haven't come up with a policy. The
-cache can keep more lines than are visible if desired, but the result
-is that inserts and deletes will then become slower as the entire
-cache has to be "corrected". Right now it doesn't delete from the
-cache (it should). As a result, scrolling through the whole buffer
-once will fill the cache with an entry for each line, and subsequent
-modifications will be slower than they should
-be. FIRST_LINE_START_INDEX is the index of the *REAL* line start of
-the first line. That is, if the first visible line is a continued
-line, this is the index of the real line start (preceded by a
-newline). FIRST_CUT_PIXELS is the number of pixels which are not
-drawn on the first visible line. If FIRST_CUT_PIXELS is zero, the
-whole line is visible. FIRST_ONSCREEN_HOR_PIXEL is not used.
-FIRST_ONSCREEN_VER_PIXEL is the absolute pixel which starts the
-visible region. This is used for setting the vertical scroll bar.
-
-Other miscellaneous things in the GtkText structure:
-
-Gtk specific things:
-
- GtkWidget widget;
-
- GdkWindow *text_area;
-
- GtkAdjustment *hadj;
- GtkAdjustment *vadj;
-
- GdkGC *gc;
-
- GdkPixmap* line_wrap_bitmap;
- GdkPixmap* line_arrow_bitmap;
-
-These are pretty self explanatory, especially if you know Gtk.
-LINE_WRAP_BITMAP and LINE_ARROW_BITMAP are two bitmaps used to
-indicate that a line wraps and is continued offscreen, respectively.
-
-Some flags:
-
- guint has_cursor : 1;
- guint is_editable : 1;
- guint line_wrap : 1;
- guint freeze : 1;
- guint has_selection : 1;
- guint own_selection : 1;
-
-HAS_CURSOR is true iff the cursor is visible. IS_EDITABLE is true iff
-the user is allowed to modify the buffer. If IS_EDITABLE is false,
-HAS_CURSOR is guaranteed to be false. If IS_EDITABLE is true,
-HAS_CURSOR starts out false and is set to true the first time the user
-clicks in the window. LINE_WRAP is where the line-wrap policy is
-set. True means wrap lines, false means continue lines offscreen,
-horizontally.
-
-The text properties list:
-
- GList *text_properties;
- GList *text_properties_end;
-
-A scratch area used for constructing a contiguous piece of the buffer
-which may otherwise span the gap. It is not strictly necessary
-but simplifies the drawing code because it does not need to deal with
-the gap.
-
- guchar* scratch_buffer;
- guint scratch_buffer_len;
-
-The last vertical scrollbar position. Currently this looks the same
-as FIRST_ONSCREEN_VER_PIXEL. I can't remember why I have two values.
-Perhaps someone should clean this up.
-
- gint last_ver_value;
-
-The cursor:
-
- gint cursor_pos_x;
- gint cursor_pos_y;
- GtkPropertyMark cursor_mark;
- gchar cursor_char;
- gchar cursor_char_offset;
- gint cursor_virtual_x;
- gint cursor_drawn_level;
-
-CURSOR_POS_X and CURSOR_POS_Y are the screen coordinates of the
-cursor. CURSOR_MARK is the buffer position. CURSOR_CHAR is
-TEXT_INDEX (TEXT, CURSOR_MARK.INDEX) if a drawable character, or 0 if
-it is whitespace, which is treated specially. CURSOR_CHAR_OFFSET is
-the pixel offset above the base of the line at which it should be
-drawn. Note that the base of the line is not the "baseline" in the
-traditional font metric sense. A line (LineParams) is
-FONT_ASCENT+FONT_DESCENT high (use the macro LINE_HEIGHT). The
-"baseline" is FONT_DESCENT below the base of the line. I think this
-requires a drawing.
-
-0 AAAAAAA
-1 AAAAAAA
-2 AAAAAAAAA
-3 AAAAAAAAA
-4 AAAAA AAAAA
-5 AAAAA AAAAA
-6 AAAAA AAAAA
-7 AAAAA AAAAA
-8 AAAAA AAAAA
-9 AAAAAAAAAAAAAAAAA
-10 AAAAAAAAAAAAAAAAA
-11 AAAAA AAAAA
-12 AAAAA AAAAA
-13 AAAAAA AAAAAA
-14______________AAAAA___________AAAAA__________________________________
-15
-16
-17
-18
-19
-20
-
-This line is 20 pixels high, has FONT_ASCENT=14, FONT_DESCENT=6. It's
-"base" is at y=20. Characters are drawn at y=14. The LINE_START
-macro returns the pixel height. The LINE_CONTAINS macro is true if
-the line contains a certain buffer index. The LINE_STARTS_AT macro is
-true if the line starts at a certain buffer index. The
-LINE_START_PIXEL is the pixel offset the line should be drawn at,
-according the the tab continuation of the previous line.
-
-Exposure and drawing:
-
-Exposure is handled from the EXPOSE_TEXT function. It assumes that
-the LINE_START_CACHE and all its parameters are accurate and simply
-exposes any line which is in the exposure region. It calls the
-CLEAR_AREA function to clear the background and/or lay down a pixmap
-background. The text widget has a scrollable pixmap background, which
-is implemented in CLEAR_AREA. CLEAR_AREA does the math to figure out
-how to tile the pixmap itself so that it can scroll the text with a
-copy area call. If the CURSOR argument to EXPOSE_TEXT is true, it
-also draws the cursor.
-
-The function DRAW_LINE draws a single line, doing all the tab and
-color computations necessary. The function DRAW_LINE_WRAP draws the
-line wrap bitmap at the end of the line if it wraps. TEXT_EXPOSE will
-expand the cached line data list if it has to by calling
-FETCH_LINES_FORWARD. The functions DRAW_CURSOR and UNDRAW_CURSOR draw
-and undraw the cursor. They count the number of draws and undraws so
-that the cursor may be undrawn even if the cursor is already undrawn
-and the re-draw will not occur too early. This is useful in handling
-scrolling.
-
-Handling of the cursor is a little messed up, I should add. It has to
-be undrawn and drawn at various places. Something better needs to be
-done about this, because it currently doesn't do the right thing in
-certain places. I can't remember where very well. Look for the calls
-to DRAW_CURSOR and UNDRAW_CURSOR.
-
-RECOMPUTE_GEOMETRY is called when the geometry of the window changes
-or when it is first drawn. This is probably not done right. My
-biggest weakness in writing this code is that I've never written a
-widget before so I got most of the event handling stuff wrong as far
-as Gtk is concerned. Fortunately, most of the code is unrelated and
-simply an exercise in data structure manipulation.
-
-Scrolling:
-
-Scrolling is fairly straightforward. It looks at the top line, and
-advances it pixel by pixel until the FIRST_CUT_PIXELS equals the line
-height and then advances the LINE_START_CACHE. When it runs out of
-lines it fetches more. The function SCROLL_INT is used to scroll from
-inside the code, it calls the appropriate functions and handles
-updating the scroll bars. It dispatches a change event which causes
-Gtk to call the correct scroll action, which then enters SCROLL_UP or
-SCROLL_DOWN. Careful with the cursor during these changes.
-
-Insertion, deletion:
-
-There's some confusion right now over what to do with the cursor when
-it's offscreen due to scrolling. This is a policy decision. I don't
-know what's best. Spencer criticized me for forcing it to stay
-onscreen. It shouldn't be hard to make stuff work with the cursor
-offscreen.
-
-Currently I've got functions to do insertion and deletion of a single
-character. It's fairly complicated. In order to do efficient pasting
-into the buffer, or write code that modifies the buffer while the
-buffer is drawn, it needs to do multiple characters at at time. This
-is the hardest part of what remains. Currently, gtk_text_insert does
-not re-expose the modified lines. It needs to. Pete did this wrong at
-one point and I disabled modification completely, I don't know what
-the current state of things are. The functions
-INSERT_CHAR_LINE_EXPOSE and DELETE_CHAR_LINE_EXPOSE do the work.
-Here's pseudo code for insert. Delete is quite similar.
-
- insert character into the buffer
- update the text property list
- move the point
- undraw the cursor
- correct all LineParams cache entries after the insertion point
- compute the new height of the modified line
- compare with the old height of the modified line
- remove the old LineParams from the cache
- insert the new LineParams into the cache
- if the lines are of different height, do a copy area to move the
- area below the insertion down
- expose the current line
- update the cursor mark
- redraw the cursor
-
-What needs to be done:
-
-Horizontal scrolling, robustness, testing, selection handling. If you
-want to work in the text widget pay attention to the debugging
-facilities I've written at the end of gtktext.c. I'm sorry I waited
-so long to try and pass this off. I'm super busy with school and
-work, and when I have free time my highest priority is another version
-of PRCS.
-
-Feel free to ask me questions.