Age | Commit message | Author |
|
Using this call in OUT_BATCH_TABLE reduces radeonEmitState CPU usage from
9% to 5%, and emit_vpu goes from 7% to 1.5%. I used callgrind to profile
gears for CPU hotspots on an r500 card.
Signed-off-by: Pauli Nieminen <suokkos@gmail.com>
|
|
GCC warned that an optimization was not possible because of a possibly infinite loop.
Signed-off-by: Pauli Nieminen <suokkos@gmail.com>
|
|
Signed-off-by: Pauli Nieminen <suokkos@gmail.com>
|
|
|
|
Nasty, but nicer than silently not writing into the pushbuf
|
|
|
|
Noticed by vehemens on irc.
Signed-off-by: Dave Airlie <airlied@redhat.com>
|
|
Signed-off-by: Eric Anholt <eric@anholt.net>
|
|
|
|
This caches the mapping and just uses the mapping as a sync point
|
|
|
|
|
|
|
|
imbalances cpu_prep/cpu_finish
- The bo was mapped with sysmem == NULL, so this means cpu prep is called.
- The bo was unmapped with sysmem != NULL, so this means cpu finish is not called.
- This can lead to a non-zero "cpu writers" count in ttm_bo.
|
|
The goal of the BO cache is to keep buffers on hand for fast continuous use,
as in every frame of a game or every batchbuffer of the X Server. Keeping
older buffers on hand not only doesn't serve this purpose, it may hurt
performance by resulting in disk cache getting kicked out, or even driving
the system to swap.
Bug #20766.
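A minimal sketch of the idea of dropping cached buffers that have not been reused recently. The `cached_bo` struct, the `cache_expire` helper and the one-second threshold are assumptions made for illustration, not the library's actual code; a real cache would also release each dropped buffer.

```c
#include <assert.h>
#include <stddef.h>

#define CACHE_MAX_AGE_SEC 1L  /* assumed threshold, not taken from the source */

/* Hypothetical cache entry: only the time it was returned to the cache. */
struct cached_bo {
    long free_time;  /* seconds, on any monotonic clock */
};

/*
 * Compact the array in place, keeping only entries that were freed within
 * the last CACHE_MAX_AGE_SEC seconds. Returns the number of entries kept.
 */
static size_t cache_expire(struct cached_bo *cache, size_t n, long now)
{
    size_t kept = 0;

    assert(cache != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (now - cache[i].free_time <= CACHE_MAX_AGE_SEC)
            cache[kept++] = cache[i];
    }
    return kept;
}
```

Calling such a helper whenever a buffer is freed keeps the cache bounded by recent use rather than by total history.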
|
|
|
|
If the call was interrupted by a signal, we have to make the call again.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
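The retry described above can be sketched as a loop around the call. The `syscall_fn` callback type and the `fake_call` stand-in are hypothetical names used only so the example is self-contained; the real code wraps an actual system call.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical callback type standing in for a system call such as ioctl(). */
typedef int (*syscall_fn)(void *arg);

/* Re-issue the call for as long as it fails with EINTR (signal interruption). */
static int retry_on_eintr(syscall_fn fn, void *arg)
{
    int ret;

    assert(fn != NULL);
    do {
        ret = fn(arg);
    } while (ret == -1 && errno == EINTR);

    return ret;
}

/* Hypothetical fake call: fails with EINTR until the counter reaches zero. */
static int fake_call(void *arg)
{
    int *remaining = arg;

    if (*remaining > 0) {
        (*remaining)--;
        errno = EINTR;
        return -1;
    }
    return 0;
}
```

Any other error is returned to the caller unchanged, so only signal interruptions are hidden.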
|
|
The logbase2 would overflow and wrap the size around to 0, making the code
allocate a 4kb object instead. By simplifying the code to just walk the
14-entry bucket array comparing sizes instead of indexing on
ffs(1 << logbase2(size)), we avoid silly math errors and have code of
approximately the same speed.
Many thanks to Simon Farnsworth for debugging and providing a working patch.
Bug #27365.
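A minimal sketch of the "walk the bucket array comparing sizes" approach. The bucket count of 14 comes from the message above; the power-of-two sizes starting at 4 KiB and the helper names are assumptions for illustration.

```c
#include <assert.h>
#include <stddef.h>

#define NUM_BUCKETS 14  /* bucket count taken from the commit message */

/* Fill sizes[] with 4 KiB, 8 KiB, ... (power-of-two sizes are an assumption). */
static void init_buckets(unsigned long sizes[NUM_BUCKETS])
{
    unsigned long size = 4096;

    assert(sizes != NULL);
    for (int i = 0; i < NUM_BUCKETS; i++, size *= 2)
        sizes[i] = size;
}

/*
 * Return the index of the smallest bucket that can hold `size` bytes, or -1
 * if the request is larger than every bucket. Plain comparisons are used, so
 * no shift or log2 computation can overflow for very large sizes.
 */
static int bucket_for_size(const unsigned long sizes[NUM_BUCKETS],
                           unsigned long size)
{
    for (int i = 0; i < NUM_BUCKETS; i++) {
        if (sizes[i] >= size)
            return i;
    }
    return -1;
}
```

A request that fits no bucket is reported explicitly, so the caller can allocate it at its exact size instead of wrapping around to the smallest bucket.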
|
|
bug #21999
|
|
integers.
|
|
Based on patch by Pauli Nieminen. Thanks.
|
|
This ports a lot of the space checking code into the common
library, so that the DDX and mesa can use it.
|
|
We always realloc at least 0x1000 dwords (page on most systems)
when growing the cs buffer; this avoids having to realloc
at each cs_begin.
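A minimal sketch of growing a command-stream buffer in steps of at least 0x1000 dwords. The `cs_buf` struct and `cs_reserve` helper are hypothetical; only the 0x1000 step and the `cdw`/`ndw` field names come from this log.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define CS_MIN_GROW_DW 0x1000u  /* minimum growth step, from the commit message */

/* Hypothetical buffer: cdw = dwords written so far, ndw = dwords allocated. */
struct cs_buf {
    uint32_t *packets;
    unsigned cdw;
    unsigned ndw;
};

/* Make room for extra_dw more dwords. Returns 0 on success, -1 on failure. */
static int cs_reserve(struct cs_buf *cs, unsigned extra_dw)
{
    unsigned needed, new_ndw;
    uint32_t *ptr;

    assert(cs != NULL);
    needed = cs->cdw + extra_dw;
    if (needed <= cs->ndw)
        return 0;  /* common case: enough space, no realloc */

    /* Grow from the allocated size (ndw) by at least CS_MIN_GROW_DW. */
    new_ndw = cs->ndw + CS_MIN_GROW_DW;
    if (new_ndw < needed)
        new_ndw = needed;

    ptr = realloc(cs->packets, (size_t)new_ndw * sizeof(*ptr));
    if (ptr == NULL)
        return -1;

    cs->packets = ptr;
    cs->ndw = new_ndw;
    return 0;
}
```

With a large minimum step, most calls take the early return and never touch the allocator.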
|
|
This should use ndw, not cdw; using cdw leads to realloc alignment going wrong
|
|
|
|
the DDX does this and used to handle it internally
|
|
requires --enable-radeon-experimental-api for now
|
|
Normal map() should operate as before, and map_range()/map_flush() should
give correct results but lacking any performance difference from map().
Nothing exciting is being done here yet, but the interface is a good start.
|
|
|
|
Fixes the dri1 gallium driver if the front buffer happens to be non-linear.
|
|
|
|
|
|
This avoids making objects significantly bigger than they would be
otherwise, which would result in some failing at binding to the GTT.
Found from firefox hanging on:
http://upload.wikimedia.org/wikipedia/commons/b/b7/Singapore_port_panorama.jpg
due to a software fallback trying to do a GTT-mapped copy between two 73MB
BOs that were instead each 128MB, and failing because both couldn't fit
simultaneously.
The cost here is that we get no opportunity to cache these objects and
avoid the mapping. But since the objects are a significant percentage
of the aperture size, each mapped access is likely having to fault and rebind
the object most of the time anyway.
Bug #20152 (2/3)
|
|
The convention is that all APIs are per-bufmgr, so make this one the same.
Then, have it return -1 on failure so that the application can know what's
going on and do something sensible.
Signed-off-by: Keith Packard <keithp@keithp.com>
|
|
This wraps the new DRM_IOCTL_I915_GET_PIPE_FROM_CRTC_ID ioctl,
allowing applications to discover the pipe number corresponding
to a given CRTC ID. This is necessary for doing pipe-specific
operations such as waiting for vblank on a given CRTC.
|
|
Scanout buffers need to be freed through the kernel as it holds a reference
to them; exposing this API allows applications allocating scanout buffers to
flag them as not reusable.
Signed-off-by: Keith Packard <keithp@keithp.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
|
|
Signed-off-by: Alan Coopersmith <alan.coopersmith@sun.com>
|
|
Add assertions to drm_intel_gem_bo_reference,
drm_intel_gem_bo_reference_locked and drm_intel_gem_bo_unreference_locked
that the object has not been freed (refcount > 0). Mistakes in refcounting
lead to attempts to insert a bo into a free list more than once which causes
application failure as empty free lists are dereferenced as buffer objects.
Signed-off-by: Keith Packard <keithp@keithp.com>
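A minimal sketch of the kind of assertion described above, using a hypothetical `bo` struct and helper names rather than the real drm_intel_gem_bo_* functions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical buffer object carrying only a reference count. */
struct bo {
    int refcount;
};

/* Take a reference; the object must still be alive (refcount > 0). */
static void bo_reference(struct bo *bo)
{
    assert(bo != NULL && bo->refcount > 0);
    bo->refcount++;
}

/*
 * Drop a reference. Returns 1 when the last reference went away and the
 * caller should free or cache the object, 0 otherwise. The assertion fires
 * on a double unreference before the object can be queued on a free list a
 * second time.
 */
static int bo_unreference(struct bo *bo)
{
    assert(bo != NULL && bo->refcount > 0);
    return --bo->refcount == 0;
}
```

The checks cost almost nothing and turn a delayed, hard-to-trace crash into an immediate failure at the faulty call site.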
|
|
Fixes assertion failures on later use of the object.
|
|
|
|
|
|
|
|
This reverts commit cd5c66c659168cbe2e3229ebf8be79f764ed0ee1. It broke too
many kernel assumptions about the double ioctl (connector status, mode
fetching, etc.)
|
|
This patch speeds up drmModeGetConnector by pre-allocating mode &
property info space before calling into the kernel. In many cases this
pre-allocation will be sufficient to hold the returned values (it's easy
enough to tweak if the common case becomes larger), which means we don't
have to make the second call, which saves a lot of time.
Acked-by: Jakob Bornecrantz <wallbraker@gmail.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
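The pre-allocation idea can be sketched as follows. Everything here (`struct query`, `fake_kernel_query`, `get_items`, the 16-item guess) is a hypothetical stand-in for the real ioctl and structures, and this log also records a revert that cites kernel assumptions about the double ioctl, so treat it as an illustration of the pattern only.

```c
#include <assert.h>
#include <stdlib.h>

#define PREALLOC_ITEMS 16u  /* assumed guess at the common-case count */

/* Hypothetical request: caller passes a buffer and its capacity; the fake
 * "kernel" reports the real count and fills at most `capacity` items. */
struct query {
    int *items;
    unsigned capacity;
    unsigned count;
};

static unsigned calls;  /* number of times the fake kernel was entered */

static void fake_kernel_query(struct query *q, unsigned available)
{
    calls++;
    q->count = available;
    for (unsigned i = 0; i < available && i < q->capacity; i++)
        q->items[i] = (int)i;
}

/* Fetch all items, making a second call only if the first guess was too small. */
static int *get_items(unsigned available, unsigned *count_out)
{
    struct query q = { NULL, PREALLOC_ITEMS, 0 };

    assert(count_out != NULL);
    q.items = calloc(q.capacity, sizeof(*q.items));
    if (q.items == NULL)
        return NULL;

    fake_kernel_query(&q, available);
    if (q.count > q.capacity) {
        int *bigger = realloc(q.items, q.count * sizeof(*q.items));

        if (bigger == NULL) {
            free(q.items);
            return NULL;
        }
        q.items = bigger;
        q.capacity = q.count;
        fake_kernel_query(&q, available);  /* second call, now large enough */
    }

    *count_out = q.count;
    return q.items;
}
```

When the guess is large enough, the caller pays for one call instead of two; otherwise the cost is the same two calls as before plus one reallocation.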
|
|
libdrm has some support for GTT mapping already, but there are bugs
with it (no surprise since it hasn't been used much).
In fixing 20803, I found that sharing bo_gem->virtual was a bad idea,
since a previously mapped object might not end up getting GTT mapped,
leading to corruption. So this patch splits the fields according to
use, taking care to unmap both at free time (but preserving the map
caching).
There's still a risk we might run out of mappings (there's a sysctl
tunable for max number of mappings per process, defaulted to 64k or so
it looks like) but at least GTT maps will work with these changes (and
some others for fixing PAT breakage in the kernel).
Reviewed-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
|
|
|
|
|
|
|
|
|
|
- This was causing a significant memory leak.
|