Age | Commit message (Collapse) | Author |
|
This is around 3x or so speedup, since we would read wide rows at a time, and
clflush each tile 8 times as a result. We'll want code related to this anyway
when we do fault-based per-page clflushing for sw fallbacks.
|
|
|
|
Fixes an oops in fbotexture from walking off the end of the page list.
|
|
This increases overhead for the large-readpixels case due to the repeated
page cache accessing, but greatly reduces overhead for the small-readpixels
case.
|
|
These will be covered by the fence, while pread/pwrite are supposed to be
CPU-perspective writes, with manual detiling done by the client.
|
|
|
|
This is in pci.h in the fixed patch to the kernel.
|
|
This requires an updated 2D driver to not try to set it up as well.
|
|
On some distros missing prototypes cause kernel builds to fail. These
are hack to make the code build.
|
|
|
|
One of our systems has been returning 0xffffffff from all MCHBAR reads, which
means we'll need to figure out why, or add an alternate detection method.
|
|
Various chips have exciting interactions between the CPU and the GPU's
different ways of accessing interleaved memory, so we need some kernel
assistance in determining how it works.
Only fully tested on GM965 so far.
|
|
|
|
With the interrupt enable/disable using only the mask register, it was wrong
to use the enable register to detect which pipes had vblank detection
turned on. Also, as we keep a local copy of the mask register around, and
MSI machines smack the hardware during the interrupt handler, it is more
efficient and more correct to use the local copy.
|
|
This shares common code sequences for managing the interrupt register bits
|
|
|
|
|
|
A mis-spelled config option (was it spelled that way in the past?)
eliminated kmap_atomic_prot_pfn from core DRM.
|
|
I915_GEM_DOMAIN_CPU is very expensive to wait for -- it generally requires
clflushing the frame buffer.
|
|
Clean up queues, free objects. On the next entervt, unmark the hardware to
let the user try again (presumably after resetting the chip). Someday we'll
automatically recover...
|
|
Pin/copy_from_user/unpin through the GTT to eliminate clflush costs.
Benchmarks say this helps quite a bit.
|
|
|
|
|
|
A mis-spelled config option (was it spelled that way in the past?)
eliminated kmap_atomic_prot_pfn from core DRM.
|
|
I915_GEM_DOMAIN_CPU is very expensive to wait for -- it generally requires
clflushing the frame buffer.
|
|
While debugging the 915, I tried this trick there and accidentally left it
set.
|
|
Clean up queues, free objects. On the next entervt, unmark the hardware to
let the user try again (presumably after resetting the chip). Someday we'll
automatically recover...
|
|
|
|
Pin/copy_from_user/unpin through the GTT to eliminate clflush costs.
Benchmarks say this helps quite a bit.
|
|
still not sure which works best on which hardware; this will make it easier
to experiment.
|
|
Noting that the interrupt mask register was more reliable than the interrupt
enable register for managing interrupts in user_irq_on/user_irq_off, this
patch replaces the remaining IER frobbing with IMR instead.
The test which exposes IER related failures is:
$ glxgears & glxgears & glxgears
(reposition the glxgears windows away from the upper left corner)
$ while :; do x11perf -rect100 -reps 800 -repeat 1; sleep 1; done &
$ while :; do runoa; runet; done &
|
|
This tracks most of the interrupt-related status, including the
interrupt registers in the chip and the sequence number variables.
|
|
Another patch adds this to a /proc/dri file for debugging and monitoring.
|
|
|
|
While waiting for the hardware to idle on leavevt or lastclose, poll
for the sync sequence number instead of waiting for an interrupt. This
allows the code to bail if the hardware hangs for some reason. Also, this
avoids issues with signals as the exisiting wait function is interruptible.
|
|
This adds gem_active, gem_flushing, gem_inactive, gem_request and gem_seqno
entries to monitor gem operation and help debug issues.
|
|
This allows device drivers to add proc files
|
|
find_or_create_page doesn't quite set up pages correctly; any newly created
pages aren't hooked into the shmem object quite right; user space mmaps of
those pages end up mapping pages full of zeros which then get written to the
real pages inappropriately. This patch requires that the kernel export
shmem_getpage.
|
|
When a software fallback has completed, usermode must notify the kernel so
that any scanout buffers can be synchronized. This ioctl should be called
whenever a fallback completes to flush CPU and chipset caches.
|
|
The PCI caps register reports MSI support even though it isn't really there.
|
|
This fixes registration when MSI is set up after the stub function fills in
dev->irq. Otherwise /proc/interrupts would report attachment to the fasteoi
interrupt. dev->irq is still exposed (and updated at IRQ setup)
for the drivers that use it for whatever reason.
|
|
In leavevt_ioctl, queue an MI_FLUSH and then block waiting for it to
complete. This will empty the active and flushing lists. That leaves only
the inactive list to evict.
|
|
Pin/unpin need to know whether to remove/add objects from the inactive list,
inactive objects cannot be in any GPU write domain as those would be on the
flushing list instead. However, inactive objects may be in the CPU write
domain.
|
|
Now that gem_object_unbind waits for rendering to complete, objects should
not be active when they are being pulled from the GTT. BUG_ON if this is
broken.
|
|
Inactive list elements may not be pinned, active or have non-CPU write
domains.
|
|
|
|
|
|
Moving to the CPU domain doesn't ensure that rendering is finished, the
buffer may still be in use as a texture or other data source.
|
|
Receiving a signal should be ignored by the library, so just restart any
ioctl which returns EINTR or EAGAIN.
|
|
Not quite portable, but these are useful for intel. Some more general
mechanism could be done...
|