In user space, when a program opens a file, it gets a file descriptor (an integer) that represents that file. The program can use this descriptor to perform various operations on the file: read, write, seek, etc. As I see it, this design is quite clean in that it:
- Hides most of the details from the user, for both safety and simplicity
- Enables a higher-level abstraction: everything (socket, pipe, ...) is a file
The file descriptor is actually an index into a kernel-space structure that contains all the details of the opened files. So on the kernel side, we need to do a lot of bookkeeping.
What information should be kept?
It's helpful to take a look at $OS161_SRC/kern/include/vnode.h. In a nutshell, a file is represented by a struct vnode in kernel space, and most of the underlying interfaces that help us manage files have already been provided. All we need to do is bookkeeping. So basically, we need to record the following details about a file:
- File name. We don't strictly need this, but it can be handy, for example to print a file's name when debugging.
- Open flags. We need to keep the flags passed to open so that later on we can check permissions on read or write.
- File offset. We definitely need this.
- Reference counter. Mainly for the dup2 and fork system calls.
- A lock to protect access to this file descriptor, since two threads may share the same copy of this bookkeeping data structure (e.g., after fork).
- A pointer to the file's struct vnode.
Why didn't we record the file's fd? Please see the next section.
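To make the list above concrete, here is a minimal sketch of what such a bookkeeping structure might look like. The field names, the FDESC_NAME_MAX limit and the two helper functions are my own illustration, not something OS161 provides. It is written so that it also compiles in user space, which is why assert stands in for the kernel's KASSERT.

```c
#include <assert.h>
#include <fcntl.h>      /* O_RDONLY, O_WRONLY, O_RDWR, O_ACCMODE */
#include <stdbool.h>
#include <stddef.h>
#include <sys/types.h>  /* off_t */

#define FDESC_NAME_MAX 64   /* hypothetical limit; the name is a debugging aid only */

struct vnode;   /* in the kernel this comes from vnode.h */
struct lock;    /* in the kernel this comes from synch.h */

/* One entry per open file; field names are illustrative. */
struct fdesc {
    char          name[FDESC_NAME_MAX]; /* file name, for debugging */
    int           flags;                /* flags passed to open */
    off_t         offset;               /* current file offset */
    int           refcount;             /* bumped by dup2 and fork */
    struct lock  *lock;                 /* protects this structure */
    struct vnode *vn;                   /* the underlying file */
};

/* True if the saved open flags allow reading through this entry. */
bool fdesc_can_read(const struct fdesc *fd)
{
    assert(fd != NULL);                 /* KASSERT in the kernel */
    int mode = fd->flags & O_ACCMODE;
    return mode == O_RDONLY || mode == O_RDWR;
}

/* True if the saved open flags allow writing through this entry. */
bool fdesc_can_write(const struct fdesc *fd)
{
    assert(fd != NULL);
    int mode = fd->flags & O_ACCMODE;
    return mode == O_WRONLY || mode == O_RDWR;
}
```

The helpers show why the open flags are worth keeping: a read or write system call can reject the request up front by masking the flags with O_ACCMODE, instead of asking the vnode layer.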
File descriptor Allocation
There are some common rules about file descriptors:
- 0, 1 and 2 are special file descriptors. They are stdin, stdout and stderr respectively. (Defined in $OS161_SRC/kern/include/kern/unistd.h as STDIN_FILENO, STDOUT_FILENO and STDERR_FILENO.)
- The file descriptor returned by open should be the smallest fd available. (Not compulsory though.)
- The fd space is process specific, i.e. different processes may get the same file descriptor that represents different files.
So, to maintain each process's open-file information, we add a new field to struct thread:
/* OPEN_MAX is defined in $OS161_SRC/kern/include/limits.h */
struct fdesc* t_fdtable[OPEN_MAX];
Now you may figure out why there isn't a fd field in struct fdesc: its index in the table is the fd! So when we need to allocate a file descriptor, we just scan t_fdtable (starting from STDERR_FILENO+1, of course), find an available slot (NULL) and use it. Also, since it's a struct thread field, it's process specific.
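The scan can be sketched as follows. This is a user-space illustration, not kernel code: fd_alloc is a name I made up, struct fdesc is reduced to a stand-in, OPEN_MAX gets a placeholder value if the system headers don't define it, and assert stands in for KASSERT.

```c
#include <assert.h>
#include <stddef.h>
#include <unistd.h>     /* STDERR_FILENO */

#ifndef OPEN_MAX
#define OPEN_MAX 128    /* placeholder; the kernel takes it from limits.h */
#endif

/* Stand-in for the real bookkeeping structure. */
struct fdesc { int refcount; };

/*
 * Store entry in the lowest free slot above stderr and return its index,
 * which is the new fd. Returns -1 if the table is full; a system call
 * would typically report that case as EMFILE.
 */
int fd_alloc(struct fdesc *table[], struct fdesc *entry)
{
    assert(table != NULL && entry != NULL);
    for (int fd = STDERR_FILENO + 1; fd < OPEN_MAX; fd++) {
        if (table[fd] == NULL) {
            table[fd] = entry;
            return fd;
        }
    }
    return -1;
}
```

Because the scan always starts from the bottom, a slot freed by close is reused by the next open, which gives the "smallest fd available" behaviour for free.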
Does t_fdtable look familiar to you? Yes, it's very similar to our process array, only that the latter is system-wide. (Confused? See my previous post on fork.)
t_fdtable Management and Special Files
Whenever you add a new field to struct thread, don't forget to initialize it in thread_create and to clean it up in thread_exit and/or thread_destroy.
Since t_fdtable is a fixed-size array, this is much easier: just zero the array on creation, and the array itself needs no cleanup. Also, t_fdtable is supposed to be inheritable, so copy the parent's t_fdtable to the child in sys_fork.
Since the parent and child threads are supposed to share the same open files, remember to increase each file's reference counter when copying the file table.
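A minimal sketch of that copy step, again in user-space C with a stand-in struct fdesc. The function name fdtable_copy is hypothetical, OPEN_MAX gets a placeholder value if the headers don't define it, and assert stands in for KASSERT.

```c
#include <assert.h>
#include <stddef.h>

#ifndef OPEN_MAX
#define OPEN_MAX 128    /* placeholder; the kernel takes it from limits.h */
#endif

/* Stand-in: only the reference counter matters for this step. */
struct fdesc { int refcount; };

/*
 * Make the child's table point at the same entries as the parent's,
 * and count the extra reference on every entry that is in use.
 */
void fdtable_copy(struct fdesc *child[], struct fdesc *const parent[])
{
    assert(child != NULL && parent != NULL);
    for (int fd = 0; fd < OPEN_MAX; fd++) {
        child[fd] = parent[fd];
        if (child[fd] != NULL) {
            /* In the kernel, take the entry's lock around this update. */
            child[fd]->refcount++;
        }
    }
}
```

Only the pointers are copied, so parent and child end up sharing the same entry, including its file offset. That sharing is the reason each entry carries a lock.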
Console files (stdin, stdout and stderr) are supposed to be opened "automatically" when a thread is created, i.e. users don't need to open them themselves.
At first glance, thread_create would be an intuitive place to initialize them. Yes, we can do that. But note that when the first thread is created, the console is not even bootstrapped yet, so if you open console files in thread_create, it'll fail (silently blocking...) at that time.
Update: The right way to do this is to initialize the console in runprogram, because that's where the first user thread is born. Later user threads will just inherit the three console file handles from then on.
BTW, how do we open the console? The path name should be "con:"; the flags should be O_RDONLY for stdin and O_WRONLY for stdout and stderr; the options (mode) should be 0664 (note the leading zero: it's an octal number).
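Here is a rough sketch of opening the three console files with those arguments. It runs in user space, so vfs_open below is a stub that only mimics the shape of the kernel call as I remember it (path, flags, mode, result pointer). Check the prototype in your tree's vfs.h before copying anything. console_open is a made-up name, and the stub clobbers the path buffer on purpose to model my understanding that the real call may modify the string it is given.

```c
#include <assert.h>
#include <fcntl.h>      /* O_RDONLY, O_WRONLY */
#include <stddef.h>
#include <string.h>
#include <sys/types.h>  /* mode_t */

struct vnode { int unused; };           /* stand-in for the kernel's vnode */

static struct vnode fake_console;       /* what the stub hands back */

/* Stub with the assumed shape of OS161's vfs_open; not the real thing. */
int vfs_open(char *path, int openflags, mode_t mode, struct vnode **ret)
{
    (void)openflags;
    (void)mode;
    if (strcmp(path, "con:") != 0) {
        return -1;
    }
    path[0] = '\0';     /* assumption: the caller's buffer may be clobbered */
    *ret = &fake_console;
    return 0;
}

/* Open stdin, stdout and stderr on the console; returns 0 on success. */
int console_open(struct vnode *std[3])
{
    static const int flags[3] = { O_RDONLY, O_WRONLY, O_WRONLY };

    assert(std != NULL);
    for (int fd = 0; fd < 3; fd++) {
        char path[] = "con:";   /* fresh writable copy for every call */
        int err = vfs_open(path, flags[fd], 0664, &std[fd]);
        if (err) {
            return err;
        }
    }
    return 0;
}
```

In a real runprogram, each returned vnode would be wrapped in a bookkeeping entry and stored in slots 0, 1 and 2 of t_fdtable. Passing a string literal or reusing one buffer for all three calls is a common trip-up if the path really is modified, hence the per-call copy.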